
Econometrics 




A WILEY PUBLICATION 
IN ECONOMICS 

Kenneth E. Boulding 

Consultant in Economics 



Econometrics 


GERHARD TINTNER 

Professor of Economics, Mathematics, 
and Statistics 

Iowa State College of Agriculture and 
Mechanic Arts, Ames, Iowa 


JOHN WILEY & SONS, INC., NEW YORK 

LONDON 


TOPPAN COMPANY, LTD., TOKYO, JAPAN 






Authorized reprint of the edition published by John 
Wiley & Sons, Inc., New York and London. 

Copyright © 1952 by John Wiley & Sons, Inc.

All Rights Reserved. No part of this book may 
be reproduced in any form without the written 
permission of John Wiley & Sons, Inc. 

Library of Congress Catalog Card Number: 51-13006


Printed in Japan 

by TOPPAN PRINTING COMPANY, LTD.









LEONTINE 




Preface 


Econometrics is a method now widely used in economic research. It 
consists in the application of modern statistical procedures to theoretical 
models, which have been formulated in mathematical terms. These 
methods are of interest in connection with the verification of economic 
laws and also are potentially useful for economic policy. 

Part 1 of the present book gives a non-technical introduction to econometrics.
Certain problems of methodology are considered, and various
fields of application are discussed with the help of examples.

Part 2 deals with various statistical procedures which come under the 
general heading of multivariate analysis. They may all be considered 
as generalizations of the classical method of least squares. These 
methods appear very promising for certain econometric problems and 
are illustrated by applications to economic data. The time series nature 
of the data is, however, neglected; there is no consideration of the mutual 
interdependence of subsequent terms of the series analyzed. 

Part 3 deals with this very difficult and challenging problem. It
should be realized that the methods given there are tentative and sometimes
not very satisfactory from the point of view of modern statistics.

Some subjects which are connected with econometrics have been 
omitted. There is only incidental discussion of mathematical economics, 
which forms the theoretical framework of econometrics. The main 
emphasis is on the analysis of data which come in the form of time 
series. There is not much on cross-section studies, which are also 
potentially useful in econometrics (see, however, sections 3.4, 3.5, 3.6). 
The theoretical discussion of the aggregation problem is omitted, but a 
statistical solution is tentatively offered (section 6.3). The analysis of 
income distributions (Pareto distribution) is omitted. There is also no 
discussion of sample surveys and related purely statistical matters. 

Econometrics is only one of a variety of possible approaches to the
study of economics. This approach appears promising, but it is important
that one realize the limitations of the method: (1) Our mathematical
models are still inadequate. (2) The statistical methods are frequently
based upon assumptions which are not strictly true for the data
analyzed. Methods which can deal with the type of situations met with
in economics are often not available or not available in the form which
is desired. It is to be hoped that this book will stimulate research in
these fields and enable us in the future to make better progress in these
difficult matters.

The first part of this book should be accessible to readers with
very little mathematical background who are acquainted with modern
economics. The remaining two parts require, apart from a good knowledge
of modern economic theory, some familiarity with the elements of
the calculus and mathematical statistics.

Two appendices deal with matrices and determinants and numerical 
computations. They should be useful to readers unacquainted with 
these subjects, which are now indispensable for a serious study of 
modern statistics. 

Statistical methods are of paramount importance for econometrics.
This book concentrates upon these matters, as far as they are useful
or at least promising for applications to economic data. Mathematical
derivations are omitted, however; they can be found in the literature
cited. Only those subjects have been included
that are of actual or potential use in econometric studies.

It would have been logically and esthetically more satisfying to present
the subject matter of econometrics in a more systematic form from
the economic point of view. A textbook or treatise on economics
written from the point of view of econometrics would be a very valuable
contribution to economic science. The field of econometrics, however,
seems not yet well enough developed for the presentation of the
subject matter in this form. The main difficulties in econometrics are
still statistical. For this reason we have concentrated on statistical subjects
and tried to present them in a somewhat systematic form.

The first two parts of this book were presented as a course of lectures
at the University of Cambridge (England), 1948-49. Portions
of Part 2 were the subject of a seminar at the University of Uppsala
(Sweden) in 1949.

I am very grateful to my colleague, J. Nordin (Iowa State College),
for advice and criticism. I am obliged to W. G. Murray, T. A. Bancroft,
and D. L. Holl of Iowa State College for help in connection with
the manuscript of this book, also to C. H. Brown and R. W. Orr of
the Iowa State College Library for bibliographical assistance.

In connection with Part 2, I am indebted to W. G. Cochran (Johns
Hopkins), O. Brownlee (University of Minnesota), L. Hurwicz (University
of Minnesota), A. M. Mood (Rand Corporation), H. Hotelling
(University of North Carolina), J. Marschak (Cowles Commission),
T. Koopmans (Cowles Commission), L. R. Klein (University of Michigan),
R. C. Geary (Dublin, Ireland), M. S. Bartlett (University of
Manchester), T. W. Anderson (Columbia University), and the late
A. Wald (Columbia University).

In connection with Part 3, I am indebted to R. Stone (Cambridge
University), F. J. Anscombe (Cambridge University), H. E. Daniels
(Cambridge University), M. S. Bartlett (University of Manchester),
and D. G. Kendall (Oxford University) for advice and criticism.

Gerhard Tintner 


Ames, Iowa 
January, 1952 




Contents 



PART 1. A NON-TECHNICAL INTRODUCTION TO ECONOMETRICS

CHAPTER

1 SCOPE AND METHOD OF ECONOMETRICS 3

1.1 Economics and Econometrics 3

1.2 Econometrics and Statistics 14

2 A SHORT SKETCH OF REGRESSION METHODS 24

3 SOME ILLUSTRATIONS OF ECONOMETRIC RESEARCH 36

3.1 Demand Functions 36

3.2 Supply Functions 43

3.3 Cost Functions 47

3.4 Production Functions 51

3.5 Utility Functions and Engel Curves 57

3.6 Tableau Economique 63

3.7 Static Models for the Total Economy 66

3.8 Dynamic Models of the Economy 69

4 THE PRACTICAL IMPORTANCE OF ECONOMETRICS 76

PART 2. INTRODUCTION TO MULTIVARIATE ANALYSIS

5 MULTIPLE REGRESSION AND CORRELATION 83

5.1 Elements of Multiple Regression 83

5.2 Distributions 86

5.3 A Test for Linear Relations 89

5.4 Partial Correlations 91

6 SOME APPLICATIONS OF MULTIVARIATE ANALYSIS TO ECONOMIC DATA 93

6.1 Notation 95

6.2 Discriminant Analysis 96

6.3 Principal Components 102

6.4 Canonical Correlations 114

6.5 Weighted Regression 121



7 STOCHASTIC MODELS WITH ERRORS IN THE EQUATIONS 154 

7.1 Identification 155 

7.2 Estimation of Equations Which Are Just Identified 166 

7.3 Estimation of a Single Overidentified Equation 172 

PART 3. SOME TOPICS IN TIME SERIES ANALYSIS 

8 THE TREND 189 

8.1 Orthogonal Polynomials 190 

8.2 Moving Averages 198

8.2.1 The Use of Moving Averages 198 

8.2.2 Successive Smoothing by Moving Averages 202 

8.2.3 Moving Averages and the Random Element 203 

8.2.4 The Effect of Moving Averages on the Amplitude of Periodic Movements 207

8.3 Fitting the Logistic 208 

8.4 Non-Parametric Tests for the Trend 211 

9 OSCILLATORY AND PERIODIC MOVEMENTS 216 

9.1 Fourier Analysis 217 

9.2 Periodogram Analysis 222

9.3 Wald’s Method of Eliminating Seasonal Fluctuations 227 

9.4 A Non-Parametric Test for Cyclical Fluctuations 234 

10 THE INTERDEPENDENCE OF SUCCESSIVE OBSERVATIONS 239 

10.1 Autocorrelation 240 

10.1.1 Wald-Wolfowitz Non-Parametric Test 240 

10.1.2 Anderson’s Test for Autocorrelation in a Circular Universe 242

10.1.3 Relations between Autocorrelated Series 247 

10.1.4 Autocorrelation of Residuals 250 

10.2 Ratio of Mean Square Successive Difference to the Variance 252 

10.3 Stochastic Difference Equations (Linear Autoregression) 255 

10.3.1 First-Order Difference Equations 255 

10.3.2 Second-Order Difference Equations 260 

10.3.3 One Single Difference Equation of Arbitrary Order 265 

10.3.4 Systems of Stochastic Difference Equations 267 

10.3.5 Lag Correlation (Serial Correlation) 269 

10.3.6 Stochastic Difference Equations with Errors in the Variables 272

10.3.7 Stochastic Difference Equations and Process Analysis 275 

10.4 Stochastic Differential Equations and General Stochastic Processes 277 

10.5 The Least Squares Method for Correlated Errors 279

10.6 Correlogram Analysis 284 

10.6.1 Moving Averages 285 

10.6.2 Linear Autoregression (Stochastic Difference Equations) 294 




11 THE TRANSFORMATION OF OBSERVATIONS 301 

11.1 Trends in Multiple Regression 301 

11.1.1 Linear Trends 301 

11.1.2 Polynomial Trends 304 

11.2 Variate Difference Method 308 

11.2.1 The Assumptions 309 

11.2.2 Large Sample Test 310 

11.2.3 Exact Test 314

11.2.4 Reduction of the Random Variability 319 

11.3 Autoregressive Transformations 323 


APPENDICES 

A.1 Elements of Matrices and Determinants 331

A.1.1 Matrices 331 

A.1.2 Linear Equations; Determinants 335

A.2 Methods of Numerical Computation 342 

A.2.1 The Crout Method for Solving Systems of Linear Equations 342 

A.2.2 Computation of Determinants 348 

A.2.3 Inversion of Matrices 348 

A.2.4 Powers of a Matrix 351 

Index of Names 359 

Index of Subjects 364 




Part 1


A Non-Technical Introduction 
to Econometrics 


In the first part of this book we intend to present the science of econometrics
for the economist with little mathematical and statistical training.
We will indicate the general position of econometrics in the hierarchy of
the sciences (Chapter 1). Then we will present a short sketch of regression
methods (Chapter 2). Finally, we intend to illustrate econometrics by
discussing some examples of its application to various
concrete economic problems (Chapter 3), and to indicate the practical
importance of econometrics (Chapter 4).




Chapter 1


Scope and Method of Econometrics 


A main source of information about the progress of econometrics is
Econometrica, the journal of the Econometric Society, published since
1933. A German journal published since 1950 is Zeitschrift für Oekonometrie,
and an Italian journal is Metroeconomica. Surveys of econometrics
in English are H. T. Davis' Theory of Econometrics (Bloomington,
Ind., 1941) and the chapter by W. W. Leontief on econometrics in A
Survey of Contemporary Economics, edited by H. S. Ellis (Philadelphia,
1948). An introduction in Dutch is given by J. Tinbergen in Econometrie
(Gorinchem, 1941). An English translation is Econometrics (New York,
1951). An excellent survey in German is a book by W. Winkler,
Grundfragen der Oekonometrie (Vienna, 1951).

The mathematical equipment necessary for the econometrician can most 
conveniently be acquired by studying R. G. D. Allen’s Mathematical 
Analysis for Economists (London, 1938), or the author’s Mathematics and 
Statistics for Economists (New York, 1953). 

Statistical methods, especially the method of multiple regression, should
be studied by reading, e.g., M. Ezekiel's Methods of Correlation Analysis
(2nd ed., New York, 1941). A more comprehensive survey of modern
statistics is M. G. Kendall's The Advanced Theory of Statistics, vols. 1
and 2 (London, 1946). See also G. U. Yule and M. G. Kendall's An
Introduction to the Theory of Statistics (14th ed., New York, 1950).

1.1 Economics and Econometrics 

Econometrics is the application of a specific method in the general field
of economic science in an effort to achieve numerical results and to verify
economic theorems. 1 It consists in the application of mathematical

1 J. Schumpeter: “The common sense of econometrics,” Econometrica, vol. 1
(1933), pp. 5 ff. R. Frisch: “The responsibility of the econometrician,” ibid.,
vol. 14 (1946), pp. 1 ff. J. Tinbergen: Econometrie (Gorinchem, 1941); Econometrics
(New York, 1951). H. T. Davis: Theory of Econometrics (Bloomington,
Ind., 1941). W. W. Leontief: “Econometrics” in A Survey of Contemporary
Economics (Philadelphia, 1948). G. Tintner: “Scope and method of



economic theory and statistical procedures to economic data in order to 
establish numerical results in the field of economics and to verify economic 
theorems. 

Econometrics has to be distinguished from mathematical economics 
and from statistical economics. It is, however, closely related to both 
and utilizes results achieved in these fields. 

Mathematical economics formulates economic theory in mathematical
terms and uses the methods of mathematics to derive economic relationships
from certain basic assumptions or axioms, e.g., from certain ideas
about maximization. 2

We assume, for instance, that the consumer maximizes his utility, while
the prices of all commodities and services and his income are given independently
of his actions. 3 From this assumption we can derive certain
properties of the individual demand functions for all commodities and 
services. Suppose that we have information relevant to these individual 
demand functions; e.g., we have certain data from family budgets. Then 
we may test statistically the validity of certain theories, regarding, for 
instance, the elasticities of the demand for various goods and services 
with respect to the prices and income. If we have only global data, e.g., 
census data, the problem becomes much more complicated. We are 
faced with a general problem in the field of index numbers, a problem of 
aggregation. 4 Not much progress has as yet been made in this field, in 


econometrics, illustrated by applications to American agriculture,” Statistical
and Social Inquiry Society of Ireland, vol. 18 (1948), pp. 161 ff.; “La position
de l'économétrie dans la hiérarchie des sciences sociales,” Revue d'économie
politique, vol. 59 (1949), pp. 634 ff. B. Chait: Sur l'économétrie (Brussels,
1949). F. Divisia: Technique et statistique (Paris, 1941). R. Stone: The Role
of Measurement in Economics (Cambridge, 1951). W. Winkler: Grundfragen
der Oekonometrie (Vienna, 1951).

2 F. Kaufmann: Methodology of the Social Sciences (London, 1944), pp.
141 ff. M. Allais: “L'emploi des mathématiques en économique,” Metroeconomica,
vol. 1 (1949), pp. 63 ff. G. Demaria: Principi generali di logica
economica (Milan, 1947). J. Rueff: From the Physical to the Social Sciences
(London, 1929).

3 K. W. Rothschild: “A note on the meaning of rationality,” Review of
Economic Studies, vol. 14 (1946), pp. 50 ff.

4 F. Divisia: Economique rationnelle (Paris, 1928), pp. 260 ff. L. R. Klein:
“Macro-economics and the theory of rational behavior,” Econometrica, vol. 14
(1946), pp. 93 ff. K. May: “The aggregation problem in a one-industry model,”
ibid., pp. 285 ff. S. S. Pu: “A note on macro-economics,” ibid., pp. 299 ff.
L. R. Klein: “Remarks on the theory of aggregation,” ibid., pp. 303 ff. K. May:
“Technological change and aggregation,” ibid., vol. 15 (1947), pp. 51 ff.



spite of very promising beginnings. A statistical approach to this problem 
will be presented in section 6.3. 

A related example is the position of a firm or enterprise which maximizes 
its profits under free competition, monopoly, monopsony, bilateral 
monopoly, or more complicated market configurations. The question of 
profit maximization has been discussed by Boulding. 5 Again, we can 
derive certain properties of the demand functions of the firm for all factors 
of production and the supply functions of all products. If we have 
suitable data, we may try to use econometric, i.e., statistical, methods to 
derive price elasticities of demand and of supply, or test hypotheses about 
these elasticities, etc. 

A more complicated system which deals with the interdependence of 
economic units and uses very ingeniously the newly developed theory of 
games of strategy has been given by Von Neumann and Morgenstern. 6 
This new theory is not yet in a state which permits statistical verification. 

Economic theory is not merely descriptive and certainly does not aspire 
to the impossible goal of giving a complete description of economic 
phenomena. 7 On the basis of certain simplifications it constructs models 

W. W. Leontief: “Introduction to a theory of the internal structure of functional
relationships,” ibid., pp. 361 ff. A. Nataf: “Sur la possibilité de la construction
de certains macromodèles,” ibid., vol. 16 (1948), pp. 330 ff. A. Nataf and
R. Roy: “Remarques et suggestions relatives aux nombres indices,” ibid.,
pp. 330 ff. K. E. Boulding: A Reconstruction of Economics (New York, 1950),
pp. 171 ff. L. R. Klein: Economic Fluctuations in the United States, 1921-41
(New York, 1950), pp. 19 ff. B. D. Mudgett: Index Numbers (New York, 1951).
M. J. Ulmer: The Theory of Cost of Living Index Numbers (New York, 1949).

5 K. E. Boulding: “The incidence of a profits tax,” American Economic
Review, vol. 34 (1944), pp. 567 ff. F. A. Lutz: “The criterion of maximum
profits in the theory of investment,” Quarterly Journal of Economics, vol. 60
(1945), pp. 56 ff. C. Hildreth: “A note on maximization criteria,” ibid., vol. 61
(1946), pp. 156 ff. F. and V. Lutz: The Theory of Investment of the Firm (New
York, 1951), pp. 16 ff.

6 J. Von Neumann and O. Morgenstern: Theory of Games and Economic
Behavior (2nd ed., Princeton, 1947). J. F. Nash: “The bargaining problem,”
Econometrica, vol. 18 (1950), pp. 155 ff. L. Hurwicz: “The theory of economic
behavior,” American Economic Review, vol. 35 (1945), pp. 909 ff. J. Marschak:
“Neumann's and Morgenstern's new approach to static economics,” Journal
of Political Economy, vol. 54 (1946), pp. 97 ff. R. Stone: “The theory of
games,” Economic Journal, vol. 58 (1948), pp. 185 ff. C. Kaysen: “A revolution
in economic theory,” Review of Economic Studies, vol. 14 (1946), pp. 1 ff.
G. Th. Guilbaud: “La théorie des jeux,” Economie appliquée, vol. 2 (1949),
pp. 275 ff.

7 F. Kaufmann: op. cit., pp. 73 ff.



which are supposed to reproduce certain aspects of the economic reality. 
These models are in the nature of ideal types. 8 With the help of these 
models we may try to derive certain laws which are supposed to embody 
some regularities of economic behavior. Econometric methods enable 
us to formulate these laws numerically, to test them statistically, and also 
to get an idea about the adequacy of the models. 

For example, a question which has been much discussed in recent years 
is the nature of the supply function for labor (section 6.5, Example 6). 9 
Does this function depend upon real wages or upon money wages? We 
can set up a more or less comprehensive model of the total economy, of 
which the labor market forms a part. If we have adequate data, we may 
try to derive the supply function of labor by statistical methods. Having 
estimated the supply function, we can then again use statistical procedures 
to test the hypothesis that it depends upon real wages, i.e., that it is
homogeneous of degree zero in wages and prices (see sections 5.3 and 6.5).
Suppose that the probability that the differences between the empirical
results and the hypothesis are due to pure chance is large, say more than
1 per cent. Then we do not reject the hypothesis, and we come to the
conclusion that the supply of labor probably depends upon real wages
and not upon money wages. This result is of some importance in the
Keynesian discussion. 10
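
The test just described can be sketched numerically. The following is only an illustration under stated assumptions, not the procedure of sections 5.3 and 6.5: it supposes a log-linear supply function, log L = a + b log w + c log p + error, so that homogeneity of degree zero in wages and prices means b + c = 0. The data are simulated, all names (ols_rss, homogeneity_f, F_CRIT_1PCT) are hypothetical, and 7.31 is the tabulated 1 per cent point of the F distribution with 1 and 40 degrees of freedom.

```python
import random

def ols_rss(rows, y):
    """Residual sum of squares from a least-squares fit of y on the columns in rows."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(c + 1, k):
            factor = a[r][c] / a[c][c]
            for j in range(c, k + 1):
                a[r][j] -= factor * a[c][j]
    # Back substitution for the coefficients.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (a[i][k] - tail) / a[i][i]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def homogeneity_f(log_w, log_p, log_l):
    """F statistic for the restriction b + c = 0 (one restriction, n - 3 d.f.)."""
    n = len(log_l)
    unrestricted = [[1.0, w, p] for w, p in zip(log_w, log_p)]
    restricted = [[1.0, w - p] for w, p in zip(log_w, log_p)]  # real wage only
    rss_u = ols_rss(unrestricted, log_l)
    rss_r = ols_rss(restricted, log_l)
    return (rss_r - rss_u) / (rss_u / (n - 3))

# Simulated (hypothetical) data: 43 observations, hence 40 degrees of freedom.
rng = random.Random(0)
n = 43
log_w = [rng.gauss(0.0, 0.3) for _ in range(n)]
log_p = [rng.gauss(0.0, 0.3) for _ in range(n)]
noise = [rng.gauss(0.0, 0.05) for _ in range(n)]
supply_real = [0.4 * (w - p) + e for w, p, e in zip(log_w, log_p, noise)]
supply_money = [0.4 * w + e for w, e in zip(log_w, noise)]

F_CRIT_1PCT = 7.31  # tabulated 1 per cent point of F(1, 40)
f_real = homogeneity_f(log_w, log_p, supply_real)
f_money = homogeneity_f(log_w, log_p, supply_money)
print("supply generated from real wages:  F =", round(f_real, 2))
print("supply generated from money wages: F =", round(f_money, 2))
print("reject homogeneity for the money-wage data:", f_money > F_CRIT_1PCT)
```

A value of F below the critical point corresponds to the case in the text where the hypothesis of dependence on real wages is not rejected; a value far above it speaks against homogeneity.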

Statistical analysis may also reveal that the theoretical model is not 
adequate. For instance, the regression coefficients may not be statistically 
significant; we may get the wrong signs compared with the ones postulated 
by economic theory, etc. Then we may decide to use a different model, 
e.g., a dynamic model (see section 3.8), where the supply of labor depends 
upon past wages and prices as well as upon contemporary wages and 

8 T. Parsons: The Structure of Social Action (New York, 1937), pp. 601 ff.
A. von Schelting: Max Webers Wissenschaftslehre (Tuebingen, 1934). M.
Weber: Gesammelte Aufsaetze zur Wissenschaftslehre (Tuebingen, 1922), pp.
146 ff. A. Schuetz: Der sinnhafte Aufbau der sozialen Welt (Vienna, 1932),
pp. 161 ff.

9 G. Tintner: An Econometric Investigation of the British Labor Market
(unpublished essay).

10 W. W. Leontief: “The fundamental assumptions of Keynes' monetary theory
of unemployment,” Quarterly Journal of Economics, vol. 51 (1936), pp. 136 ff.;
“Postulates: Keynes' general theory and the classicists,” in S. E. Harris, ed.:
The New Economics (New York, 1947), pp. 232 ff. G. Tintner: “Homogeneous
systems in mathematical economics,” Econometrica, vol. 16 (1948), pp. 273 ff.;
“Static macro-economic models and their econometric verification,” Metroeconomica,
vol. 1 (1949), pp. 48 ff. W. Fellner: Monetary Policies and Full
Employment (Berkeley, 1946), pp. 94 ff., 109 ff.




prices. This model can again be investigated by appropriate statistical 
procedures. Some of these methods will be discussed in Part 3. 

There is in our opinion no fundamental difference between mathematical
economics and economic theory which utilizes non-mathematical
methods. 11 Many economic theorems have first been formulated in a
literary way, and later restated in mathematical terms. The best example
of such a procedure is perhaps the Keynesian 12 theory, which was first
stated by Keynes in a non-mathematical form. Later it was reformulated
by many mathematical economists in terms of mathematical equations. 13
This reformulation has certain advantages compared with the original
theory. It has brought out the basic hypotheses and has enabled us
to see more clearly the difference between Keynesian and what has come
to be called “classical” economics.

Many non-mathematical economists, e.g., Marx 14 and Böhm-Bawerk, 15
have used the method of numerical examples, which is “almost” mathematical.
But it is apparent that mathematical treatment of the problems
involved is greatly preferable. It gives increased generality to the results
and brings out at once more clearly the basic hypotheses underlying the
theory and its limitations. We do not believe there can be any question
that, for instance, the mathematical reformulation of the Böhm-Bawerkian


11 F. Kaufmann: op. cit., pp. 90 ff.

12 J. M. Keynes: The General Theory of Employment, Interest and Money
(London, 1936).

13 See, e.g., J. R. Hicks: “Mr. Keynes and the classics,” Econometrica, vol.
5 (1937), pp. 147 ff. O. Lange: “The rate of interest and the optimum propensity
to consume,” Economica, new series, vol. 5 (1938), pp. 12 ff. F. Modigliani:
“Liquidity preference and the theory of money,” Econometrica, vol. 12
(1944), pp. 45 ff. L. R. Klein: The Keynesian Revolution (New York, 1947),
technical appendix, pp. 189 ff. J. E. Meade: “A simplified model of Mr.
Keynes' system,” Review of Economic Studies, vol. 4 (1936), pp. 98 ff. G.
Lutfalla: “La querelle des classiques et modernes,” Revue d'économie politique,
vol. 57 (1947), pp. 361 ff. G. Tintner: “Static macro-economic models and
their econometric verification,” Metroeconomica, vol. 1 (1949), pp. 48 ff. S. E.
Harris: “In relation to classical economics: evolution or revolution,” in S. E.
Harris, ed.: The New Economics (New York, 1948), pp. 55 ff.

14 K. Marx: Capital, vol. 2 (Chicago, 1907), pp. 453 ff. See also P. M.
Sweezy: The Theory of Capitalist Development (New York, 1942). L. R. Klein:
“Theories of effective demand and employment,” Journal of Political Economy,
vol. 55 (1947), pp. 108 ff.

15 E. von Böhm-Bawerk: Positive Theory of Capital (New York, 1930), pp.
260 ff.




capital theory by Wicksell 16 is preferable to the original statement by
Böhm-Bawerk. The same might be said about the post-Keynesian
formulations already mentioned.

The relation between logic and mathematics is much disputed. We
believe, however, that Bertrand Russell 17 has shown that mathematics
may be considered a branch of logic. No one who is familiar with the 
vast literature of logistics can deny that it is possible to state logical 
relationships symbolically, i.e., in mathematical terms. On the other 
hand, mathematics can be stated and formulated in purely logical terms, 
i.e., without the use of symbols. Mathematics may be based upon logic, 
though logic of a non-Aristotelian kind. Mathematics and logic are 
very closely related, if not identical. Hence it is inconsistent of the 
theoretical economist not to use mathematics. 

All economic laws are conditional. The propositions of welfare economics, 18
in particular, deal with the way in which ideal use of resources
can be obtained, given a certain social objective. This objective is 
normative, and welfare propositions are conditioned upon it. 

It may be said, for instance, that the lowering of tariffs will under 
certain conditions increase the national product. This proposition may 
or may not be true, but is certainly verifiable at least in principle. It 
can be tested, e.g., by studying cases in which tariffs have actually been 
lowered. Econometric methods may be quite useful in performing such 
a test. But economics does not tell us anything about the question: 
Should tariffs be lowered? This will depend upon the social goals pursued
in policy. These goals will in most cases be multiple and not unique. 19 


16 K. Wicksell: Ueber Wert, Kapital und Rente, reprint (London, 1933).

17 B. Russell: Introduction to Mathematical Philosophy (New York, 1919).
See, however, H. Poincaré: Science et méthode (Paris, 1908), pp. 152 ff. H.
Weyl: “Philosophie der Mathematik und der Naturwissenschaften,” Handbuch
der Philosophie (Munich, 1927); Philosophy of Mathematics and Natural
Science (Princeton, 1949), pp. 50 ff.

18 M. Reder: Studies in the Theory of Welfare Economics (New York, 1947).
P. A. Samuelson: Foundations of Economic Analysis (Cambridge, Mass., 1947),
pp. 203 ff. W. J. Baumol: “Community indifference,” Review of Economic
Studies, vol. 14 (1946), pp. 44 ff. O. Lange: “The foundations of welfare
economics,” Econometrica, vol. 10 (1942), pp. 215 ff. G. Tintner: “A note
on welfare economics,” ibid., vol. 14 (1946), pp. 69 ff. A. C. Pigou: The
Economics of Welfare (London, 1920). H. Myint: Theories of Welfare Economics
(London, 1948). I. M. D. Little: A Critique of Welfare Economics
(Oxford, 1950).

19 T. Parsons: op. cit., pp. 642 ff. M. Weber: “Politik als Beruf” in Gesammelte
politische Schriften (Munich, 1921). T. de Scitovsky: “A reconsideration




Various ends will compete with each other. For instance, the lowering 
of tariffs may be considered desirable because, on the whole, it raises the 
current level of living. Against this has to be balanced a potential loss 
of self-sufficiency, which may be serious for the national existence of a 
country in times of war. The choice of a policy will not depend upon
economic considerations. These considerations can only supply the (true
or false) statement: If tariffs are lowered, the national product will increase.
Econometrics may even be able to tell us by how much it will probably 
increase. The choice of policy, however, is a matter of politics, ethics, 
and similar considerations. 20 

But why is mathematical economics not enough for the purposes of 
economic science? The work of Marshall, 21 Walras, 22 Pareto, 23 and their 
modern followers 24 is certainly an imposing intellectual achievement. 
But, as Marshall himself has pointed out, 25 something has still to be 
added. We desire numerical results in economics in order to be able 
to make quantitative predictions. This can be done with the help of 
statistics. Statistics is necessary if we want to proceed from the abstract 
formulations of mathematical economics to numerical results. These 
results may also enable us to verify the economic theorems involved. 

In economic theory we derive certain models which include two types 
of relationships. The first kind are purely definitional, for instance: 
Savings plus investment equals income; expenditure on a commodity 
is its price times the quantity bought. These relationships offer no 
problem for the statistician. 

But apart from these definitional relationships there are other structural 
relationships, which describe the behavior of the individuals in the 


of the theory of tariffs,” Review of Economic Studies, vol. 9 (1941), pp. 89 ff.
J. de V. Graaf: “On optimum tariff structures,” ibid., vol. 17 (1949), pp. 47 ff.
I. M. D. Little: “Welfare and tariffs,” ibid., vol. 16 (1948), pp. 65 ff. R. Aron:
La sociologie allemande contemporaine (2nd ed., Paris, 1950), pp. 97 ff. M.
Boiteux: “Le revenu distribuable et les pertes économiques,” Econometrica,
vol. 19 (1951), pp. 112 ff.

20 J. von Kempski: “Wie ist Theorie der Politik möglich,” Zeitschrift für
die gesamte Staatswissenschaft, vol. 106 (1950), pp. 447 ff.

21 A. Marshall: Principles of Economics (8th ed., London, 1920).

22 L. Walras: Eléments d'économie politique (4th ed., Lausanne, 1926).

23 V. Pareto: Manuel d'économie politique (2nd ed., Paris, 1927).

24 See, e.g., G. J. Stigler: Production and Distribution Theories (New York,
1949). E. Antonelli: L'économie pure du capitalisme (Paris, 1939). G. H.
Bousquet: Institutes de science économique, vol. 1 (Paris, 1930), pp. 213 ff.

25 A. Marshall: “On the graphic method of statistics,” Jubilee Volume Royal
Statistical Society (London, 1885), p. 260.




economy. 26 These are, for instance, production functions, supply functions,
and demand functions. These structural relationships involve
structural parameters, which have to be estimated by statistical methods.
Examples of such parameters are elasticity of demand with respect to
price, elasticity of demand with respect to income, marginal propensity
to consume, and marginal productivity. The estimation of these
parameters is difficult because the empirical economic data are
actually the result of the interaction of many structural relationships.
Hence, at least in principle, we have to deal with systems of equations
(see section 6.5 and Chapter 7) and only in exceptional cases with single
equations.

If we could experiment 27 in economics, as in biology or in physics, we 
could determine the structural relationships by holding constant certain 
variables (say, income with the demand function) and investigate the 
effect of other variables which we vary deliberately in some desired 
way (say, the price of the commodity). But such experiments are not 
possible, and we have to take the interrelationships into account. 
This is necessary if we want to isolate individual causes which operate 
simultaneously. 

The estimation of structural parameters is of paramount importance 
if we want to derive useful results for economic policy. Assume, for 
instance, that we have data on price and quantity which result from the 
interaction of a demand and a supply function on an isolated market. 
The government intervenes and fixes the supply of the commodity. The 
effect of this governmental policy will then depend upon certain structural 
parameters of the demand function, for instance the elasticity of demand. 
Hence, the estimation of this structural parameter is of great importance 
for applications of econometrics in economic policy. 
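
The difficulty described above, that observed prices and quantities result from the interaction of demand and supply, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the linear demand and supply functions, the parameter values, and the names `simulate_market` and `ols_slope` are hypothetical and do not come from the text. With equal shock variances in both equations, a naive single-equation regression of quantity on price recovers neither structural slope.

```python
import random

def simulate_market(n, b=1.0, d=1.0, a=10.0, c=2.0, seed=0):
    """Generate n equilibrium (price, quantity) pairs.

    Hypothetical structural model (an assumption, not from the text):
      demand: q = a - b*p + u
      supply: q = c + d*p + v
    with independent normal shocks u and v of unit variance.
    """
    rng = random.Random(seed)
    prices, quantities = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        p = (a - c + u - v) / (b + d)   # price that equates demand and supply
        q = a - b * p + u               # corresponding quantity on the demand curve
        prices.append(p)
        quantities.append(q)
    return prices, quantities

def ols_slope(x, y):
    """Slope of the single-equation least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

prices, quantities = simulate_market(5000)
slope = ols_slope(prices, quantities)
print(f"true demand slope: -1.0, naive regression slope: {slope:.3f}")
```

In this sketch the fitted slope lies near zero, far from the demand slope of -1, which is why systems of equations rather than single equations have to be considered.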

It is one thing to develop the theoretical concept of the elasticity of 


26 T. Haavelmo: "The statistical implications of a system of simultaneous
equations," Econometrica, vol. 11 (1943), pp. 1 ff.; "The probability approach
in econometrics," ibid., vol. 12 (1944), supplement. T. C. Koopmans: "Statistical
estimation of simultaneous economic relationships," Journal of the American
Statistical Association, vol. 40 (1945), pp. 448 ff. J. Marschak: "Statistical
inference in economics," in T. C. Koopmans, ed.: Statistical Inference in
Dynamic Economic Models (New York, 1950), pp. 1 ff.

27 R. A. Fisher: The Design of Experiments (5th ed., Edinburgh and London,
1949). W. G. Cochran and G. M. Cox: Experimental Designs (New York,
1950). H. B. Mann: Analysis and Design of Experiments (New York, 1950).
F. J. Anscombe: "The validity of comparative experiments," Journal of the
Royal Statistical Society, vol. 111 (1948), pp. 181 ff.



1.1] ECONOMICS AND ECONOMETRICS 11


demand, as has been done by Cournot, 28 Marshall, 29 and their followers.
It is another thing to state that the estimated price elasticity of demand
for agricultural products in the United States, 1920-43, on the average
is -0.123 30 (section 6.5, Example 3). This is to say that to an increase
of 1 per cent in agricultural prices in the United States in the period
considered corresponds approximately, other things equal, a decrease
in the total demand for agricultural products of 0.123 per cent, or not
quite 1/8 of 1 per cent.
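
The arithmetic behind this statement can be written out as a short worked equation. The symbols are assumptions introduced here for illustration: $\eta$ for the price elasticity of demand, $q$ for the quantity demanded, and $p$ for the price.

```latex
\[
\eta = \frac{dq/q}{dp/p},
\qquad
\frac{\Delta q}{q} \approx \eta \,\frac{\Delta p}{p}
= (-0.123) \times 1\ \text{per cent}
= -0.123\ \text{per cent}.
\]
```

The approximation is meant for small price changes with other things held equal, as in the text.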

Using more refined methods of statistical analysis, we can establish the
95 per cent confidence or fiducial limits of the elasticity of demand 31 (see
section 1.2). They are -0.052 and -0.195.

We can state the following: Other things equal, to an increase of agricultural
prices of 1 per cent corresponds a decrease of demand for agricultural
products which is not smaller than about 1/20 and not larger than
about 1/5 of 1 per cent. This statement and other similar statements
about confidence limits have a chance to be right in about 95 per cent
of all cases, in the long run, on the average.

This result is certainly interesting to the pure economist who has always 
suspected that the demand for agricultural products is rather inelastic, 
i.e., to an increase of 1 per cent in the price corresponds, other things 
equal, a decrease in demand of less than 1 per cent. 32 This is a verification
of an economic theorem. But it is also of interest to the economist
who engages in the study of economic policy. It gives some idea about
the consequences of an increase or decrease in agricultural prices, as far 
as demand for agricultural products is concerned. 

Assume, for instance, that the American government decides to raise 
agricultural prices by 10 per cent. This may be done by price fixing. 
If the results of the statistical analysis are reliable, the quantity of agricultural
products demanded can be expected, ceteris paribus, to decline
by about 1 per cent. This result may be considered desirable or not 
desirable, depending on the social ends which are pursued in economic 
policy. Under certain circumstances we may consider the decline in 
consumption of agricultural commodities negligible, compared with the 


28 A. Cournot: Researches into the Mathematical Principles of the Theory
of Wealth (New York, 1927), pp. 56 ff.; Recherches sur les principes
mathematiques de la theorie des richesses, G. Lutfalla, ed. (Paris, 1938), pp. 61 ff.

29 A. Marshall: Principles of Economics (8th ed., London, 1920), pp. 102 ff.

30 G. Tintner: "Multiple regression for systems of equations," Econometrica,
vol. 14 (1946), pp. 5 ff., esp. p. 34.

31 Ibid.

32 A. Marshall: Principles of Economics (8th ed., London, 1920), p. 102.





benefit accruing to farmers from the increase in prices. On the other 
hand, the decline in consumption may, for instance, present a serious 
threat to the general health of the population. This example shows again 
that various ends pursued in social policy may be in conflict. Economics 
and econometrics can contribute nothing, as far as the choice of a concrete 
policy based upon these ends is concerned. But econometrics can perhaps 
contribute something in giving us numerical estimates of the results of 
the adoption of various possible policies. 
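
The price-fixing example above can be sketched numerically. This is a minimal illustration, assuming that the elasticity may be applied linearly to a 10 per cent price change; the function name `demand_change` is hypothetical, while the elasticity and its 95 per cent limits are the figures quoted in the text.

```python
def demand_change(elasticity, price_change_pct):
    """Approximate percentage change in quantity demanded, other things equal.

    Assumes the elasticity can be applied linearly to the whole price change.
    """
    return elasticity * price_change_pct

POINT_ESTIMATE = -0.123      # estimated price elasticity quoted in the text
LIMITS = (-0.195, -0.052)    # 95 per cent limits quoted in the text
PRICE_RISE = 10.0            # per cent, as in the price-fixing example

central = demand_change(POINT_ESTIMATE, PRICE_RISE)
largest_fall = demand_change(LIMITS[0], PRICE_RISE)
smallest_fall = demand_change(LIMITS[1], PRICE_RISE)

print(f"expected change in demand: {central:.2f} per cent")
print(f"range implied by the limits: {largest_fall:.2f} to {smallest_fall:.2f} per cent")
```

The central figure of about -1.2 per cent corresponds to the decline of "about 1 per cent" mentioned in the text. Whether such a decline is acceptable remains a question of social ends, which the computation cannot settle.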

Econometrics has also to be distinguished from statistical economics. 
Statistical economics declines the use of economic theory and claims to 
present a statistical summary of the economic data themselves. 33 This 
has recently been called rather aptly “measurement without theory." 34 

It may be doubted whether such a procedure is fruitful or even possible. 
Some kind of fundamental theoretical ideas underlie even the work of 
the most institutionalist-minded statistical economist. The selection of 
the data, their organization, etc., imply already some kind of underlying 
general conception or theory, even if it is not explicitly stated. Econometrics
represents an intermediary position between the extreme non-theoretical
empiricism of the statistical economists and the non-empirical
theorizing of some "pure" economists. 35

The econometrician has to be very grateful for the work done by the 
statistical economist. For instance, the pioneer work of Kuznets 36 in the
United States and of Bowley 37 and Stamp 38 in England regarding the 
collection of data about national income is valuable, if we want to make 
use of statistical analysis in order to derive numerical economic laws or 
verify economic theorems involving national income. But the data 

33 See, e.g., A. F. Burns and W. C. Mitchell: Measuring Business Cycles
(New York, 1946). F. C. Mills: The Behavior of Prices (New York, 1927).

34 T. C. Koopmans: "Measurement without theory," Review of Economic
Statistics, vol. 29 (1947), pp. 161 ff. R. Vining and T. C. Koopmans: "Methodological
issues in quantitative economics," Review of Economics and Statistics,
vol. 31 (1949), pp. 77 ff.

35 P. T. Homan: "The institutionalist school," Encyclopaedia of the Social
Sciences, vol. 5 (New York, 1937), pp. 387 ff. L. von Mises: Grundprobleme
der Nationaloekonomie (Jena, 1933); Human Action (New Haven, 1949), pp.
347 ff., 697 ff., 706 ff. See also L. Robbins: An Essay on the Nature and Significance
of Economic Science (2nd ed., London, 1949), pp. 59 ff., pp. 106 ff.
A. G. Papandreou: "Economics and the social sciences," Economic Journal,
vol. 60 (1950), pp. 715 ff.

36 S. Kuznets: National Income, a Summary of Findings (New York, 1946).

37 A. L. Bowley: Studies in National Income, 1924-28 (Cambridge, 1942).

38 Sir J. Stamp: The National Capital (London, 1933).





themselves are not enough. They can be interpreted and analyzed only
by the use of theory. Economic theory, especially mathematical economics,
has to provide the framework, and modern statistical analysis
has to supply the tools in order to achieve numerical results, which after
all are the goal of econometrics.

The aversion of many statistical economists to the use of theory and 
econometric methods is based upon the idea that there are no laws in 
the social sciences, or that these laws are very ephemeral and unstable. 39 
There is a grain of truth in these objections, and the econometricians 
should be careful not to overstate their case. 

The same objection against the existence of laws can also be made in 
the natural sciences. This has sometimes been called the general problem 
of induction. 40 Certainly, the case against the existence of stable laws 
is a better one in the social than in the natural sciences. But ultimately 
only the existence of a body of valid numerical economic relationships, 
which is the goal of econometrics, can disprove the contentions of the 
radical institutionalists and followers of the historical school. 41 

It is not impossible that the case for econometrics has sometimes been
overstated by enthusiastic econometricians. Econometrics is a useful
method of economic research, but certainly not the only one suitable for 
the verification of economic theorems. It can, for instance, throw very 
little light on the problems of economic development, e.g., the origin 
and evolution of the capitalist system. 42 Here historical research is much 
more fruitful, if only because the data are too scarce to allow us the 
successful application of econometric methods. The study of economic 
institutions, especially of the legal framework 43 of economic activity, is 
also a very useful procedure in economic research. An institutionalist 
study of the banking system, 44 of trade union organization, 45 etc., may

39 L. Robbins: op. cit.

40 H. Reichenbach: Theory of Probability (Berkeley, 1949), pp. 429 ff.

41 H. Schumacher: "The historical school," Encyclopaedia of the Social
Sciences, vol. 5 (New York, 1937), pp. 371 ff.

42 W. Sombart: "Capitalism," in Encyclopaedia of the Social Sciences, vol.
3 (New York, 1937), pp. 195 ff.; Der moderne Kapitalismus, 3 vols. (Munich,
1921-27). E. D. Domar: "Capital accumulation and the end of prosperity,"
Econometrica, vol. 17 (1949), supplement, pp. 307 ff. C. Clark: "Theory of
economic growth," ibid., pp. 112 ff. E. D. Domar: "Capital expansion, rate
of growth and employment," ibid., vol. 14 (1946), pp. 137 ff. P. M. Sweezy:
op. cit.

43 J. R. Commons: Legal Foundations of Capitalism (New York, 1924).

44 A. G. Hart: Money, Debt and Economic Activity (New York, 1948).

45 S. Perlman: A Theory of the Labor Movement (New York, 1928).





give us great insight into the nature of these phenomena and enable us 
to understand certain economic features of a given society. Econometrics 
cannot claim a monopoly as a method of economic research. 

Economics is a social science. 46 It deals with a special aspect of society,
i.e., the administration of scarce resources in order to satisfy wants. It
is an empirical science like physics and biology and not an a priori science
like logic or mathematics. Economic theorems can be verified by empirical
data, at least in principle; in effect, econometrics is perhaps the
outstanding method for such a verification.

But economics has one characteristic in common with other social
sciences, e.g., psychology. Apart from observation of external
economic events, such as the price formation on a market, we have also
an additional source of information about economic events: introspection. 47
Introspection is particularly helpful in connection with
problems in the field of consumption. This source is perhaps not entirely
reliable, but should by no means be neglected. However, the results of
introspection should be carefully checked and compared with other
economic observations. 48

The pure economist may, for instance, by introspection find that the 
more chocolate he consumes the less is the enjoyment of an additional 
piece of chocolate. But before generalizing this into a universal law of 
diminishing marginal utility 49 (or, in more modern terms, of a diminishing
marginal rate of substitution), 50 he should check his results carefully 
with economic observations performed on subjects other than himself. 
Market studies of the demand for certain commodities may be of great 

will also have some importance 
(see section 3.5). Econometric research may eventually give some numeri¬ 
cal estimates of the phenomenon in question. 

1.2 Econometrics and Statistics 

Statistics, especially modern statistical theory, is of paramount importance
for econometrics. Statistical methods of sampling are already


46 O. Lange: "The scope and method of economics," Review of Economic
Studies, vol. 13 (1945), pp. 19 ff. K. W. Rothschild: op. cit., pp. 50 ff.

47 F. Kaufmann: op. cit., pp. 143 ff.

48 P. A. Samuelson: op. cit., pp. 21 ff.

49 P. N. Rosenstein-Rodan: "Grenznutzen," in Handwoerterbuch der
Staatswissenschaften (4th ed., Jena, 1927), vol. 4, pp. 1190 ff. L. Illy: Das Gesetz des
Grenznutzens (Vienna, 1948). A. Mahr: Volkswirtschaftslehre (Vienna, 1948).
O. Weinberger: Grundriss der Volkswirtschaftslehre (Vienna, 1937), pp. 37 ff.

50 J. R. Hicks: Value and Capital (2nd ed., Oxford, 1948), pp. 20 ff.





useful for collecting the data which are the raw material of econometric 
studies. 1 Statistics has the additional advantage of giving us an idea 
about the reliability of the data. But the econometrician is not in general 
concerned with the collection of data and similar problems, which are the
job of the professional statistician. It might, however, be advantageous
if more econometricians could gain influence over the way in which
statistical data are collected and presented. It is particularly unfortunate 
that so many of our basic data are still the by-product of administrative 
processes. 

Modern methods of statistical analysis are indispensable for the econometrician.
The econometric relationships are always of a statistical
nature, as already indicated in the above example about the elasticity of 
demand for agricultural products (section 1.1). 

Modern statistical methods have been developed, especially by R. A.
Fisher 2 and his school, 3 mostly for practical use in the biological sciences,
where they have been very successful. The same general methodology
can also frequently be used in the social sciences, especially in economics.
But there is a difference between biological and social sciences: in economics,
as in astronomy and meteorology, no experimentation is possible. 4
The two examples of astronomy and meteorology indicate that this peculiarity
is not confined to the social sciences. The example of astronomy,
one of the oldest and surely one of the most successful of the natural
sciences, also shows that this difficulty should not prevent the emergence
of a vast number of empirical laws, which are well confirmed by experience.
However, the blind application of statistical methods which have proved
useful in biology and agricultural experimentation is not possible in
economics.

Although this has been recognized by many econometricians for a
long time, they formerly used rather unreliable methods because no better
procedures were available. But now, thanks to the effort of econometricians
who saw clearly the fundamental problem, we have more
promising methods especially designed for dealing with non-experimental 

1 See, e.g., A. J. King and R. J. Jessen: "The master sample of agriculture,"
Journal of the American Statistical Association, vol. 40 (1945), pp. 38 ff. W. E.
Deming: Some Theory of Sampling (New York, 1950).

2 R. A. Fisher: Statistical Methods for Research Workers (10th ed., Edinburgh,
1946).

3 See, e.g., C. E. Weatherburn: A First Course in Mathematical Statistics
(Cambridge, 1947).

4 T. Haavelmo: "The probability approach in econometrics," Econometrica,
vol. 12 (1944), supplement, pp. 12 ff. D. Cochrane: "Measurement of economic
relationships," Economic Record, vol. 25 (1949), pp. 7 ff.





data. 5 It should be emphasized that these methods use the same fundamental
ideas of modern statistical theory as biological statistics, e.g.,
maximum likelihood, tests of hypotheses, fiducial or confidence limits 
(illustrated in section 1.1). No longer, however, are these methods used 
blindly and by mere analogy to statistical procedures which are useful 
and successful in other fields. These matters will be discussed in detail 
in section 6.5 and Chapter 7. 

Modern statistics is based upon the idea of probability. 6 There are 
a number of conflicting ideas about this concept, which is fundamental 
for all scientific methodology. Some authors hold that probability 
statements refer to propositions and are hence logical and not empirical. 
This concept refers to our rational degree of belief in a theory or hypothesis 
on the basis of empirical evidence. Two outstanding theorists who hold 
this view are Keynes 7 and Jeffreys. 8 Another school thinks that probability
refers to the relative frequency of an outcome in frequently repeated
experiments, as the number of trials increases. 9

Carnap 10 has recently shown that both these concepts are legitimate 
and useful. The econometrician ought to use the first concept when 

5 J. Marschak: “Statistical inference in economics," in T. C. Koopmans, ed.: 
Statistical Inference in Dynamic Economic Models (New York, 1950), pp. 1 ff. 
T. C. Koopmans: “Statistical estimation of simultaneous economic relations,” 
Journal of the American Statistical Association , vol. 40 (1945), pp. 448 ff. E. J. 
Working: “What do statistical demand curves show?” Quarterly Journal of 
Economics , vol. 41 (1927), pp. 212 ff. 

6 E. Nagel: "Principles of the theory of probability," International Encyclopaedia
of Unified Science, vol. 1, no. 6 (Chicago, 1939).

7 J. M. Keynes: A Treatise on Probability (London, 1921).

8 H. Jeffreys: Theory of Probability (Cambridge, 1939). G. A. Barnard:
"Statistical inference," Journal of the Royal Statistical Society, series B, vol. 11
(1949), pp. 115 ff.

9 R. von Mises: Probability, Statistics and Truth (London, 1939). W. Feller:
An Introduction to Probability Theory and Its Applications, vol. 1 (New York,
1950). R. von Mises: Wahrscheinlichkeitsrechnung (Vienna, 1931). J. Neyman:
First Course in Probability and Statistics (New York, 1950), pp. 15 ff.
H. Reichenbach: Wahrscheinlichkeitslehre (Leiden, 1935). A. Kolmogoroff:
Grundbegriffe der Wahrscheinlichkeitsrechnung (Berlin, 1933).

10 R. Carnap: "On inductive logic," Philosophy of Science, vol. 12 (1945),
pp. 72 ff.; "The two concepts of probability," Philosophy and Phenomenological
Research, vol. 5 (1945), pp. 513 ff.; Logical Foundations of Probability (Chicago,
1950). B. de Finetti: "La prevision: ses lois logiques, ses sources subjectives,"
Annales de l'Institut Henri Poincare, vol. 7 (1937), pp. 1 ff. M. G. Kendall:
"On the reconciliation of theories of probability," Biometrika, vol. 36 (1949),
pp. 101 ff.





talking about the probability of a theory or a hypothesis. For example, 
we might discuss the question whether on the basis of empirical evidence 
the Keynesian or the “classical" theory 11 is more probable. A theory 
of this probability concept, based upon the fundamental ideas of modern 
mathematical logic, has been developed by Carnap. He calls this concept 
the degree of confirmation. It must be confessed, however, that these 
ideas are not yet applicable to any except the simplest problems in the 
field of statistics, i.e., those dealing with the theory of attributes. The use 
of these ideas in econometrics has to await further developments of the 
theory. 12 

Another and entirely different probability concept refers to the relative 
frequency of an event, as the number of trials increases indefinitely. We 
may, for instance, throw a coin a great number of times and note the 
relative frequency of heads in each series of throws. As we get more 
and more experiments, the relative frequency of heads among the total 
number of throws will sometimes tend to a limit. This limit can, under 
certain conditions, be identified with the probability of obtaining a head 
with a throw of this particular coin. This concept of probability as the 
limit of relative frequency (defined, e.g., in the manner of von Mises) 
has to be distinguished from the first concept of probability. 
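
The coin-throwing experiment described above can be imitated on a computer. The following is a minimal sketch, assuming a pseudo-random generator as a stand-in for a physical coin; the function name `relative_frequency_of_heads`, the seed, and the numbers of throws are hypothetical choices. A simulation of this kind only illustrates the tendency of the relative frequency to settle down; it does not prove that a limit exists.

```python
import random

def relative_frequency_of_heads(n_throws, p_heads=0.5, seed=1):
    """Relative frequency of heads in a simulated series of n_throws throws."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_throws) if rng.random() < p_heads)
    return heads / n_throws

# Because the seed is fixed, each longer series extends the shorter ones,
# as if the same experimenter simply kept on throwing the coin.
for n in (10, 100, 1000, 10000, 100000):
    print(n, relative_frequency_of_heads(n))
```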

The econometrician may, for instance, consider the relative frequency 
of business failures, i.e., the percentage of businesses which fail each year. 
If he takes a larger and larger sample of business enterprises, he may talk 
about the probability of a business failure as the limit of the relative 
frequency of failures in a given sample, as the sample becomes larger and 
larger. It is evident that this second concept of probability has to be 
sharply distinguished from the first. 

Since the first probability concept is not yet useful for any except the 
simplest problems of statistical inference, we will follow statistical practice 
and use, for the time being, only the second concept. But we should 
bear in mind that the results reached in this way are not very satisfactory 
from a philosophical and epistemological point of view. 

The foundations of probability theory are disputed and will remain so 
for some time to come. There is also not very much agreement about
the methods to be used in statistical inference. These methods are of 
very great importance for econometrics. 


11 G. Tintner: "Static macro-economic models and their econometric verification,"
Metroeconomica, vol. 1 (1949), pp. 48 ff. E. Fossati: "Vilfredo
Pareto and John Maynard Keynes," ibid., pp. 126 ff.

12 G. Tintner: "Foundations of probability and statistical inference," Journal
of the Royal Statistical Society, vol. 112 (1949), pp. 251 ff.




Statistical inference deals with the way in which conclusions about the 
population are drawn from the sample. Economic relationships may 
always be regarded as samples from an unknown infinite population 13 of 
all possible economic relationships. Econometricians use statistical 
methods in order to obtain numerical results or estimates. These estimates
may consist of one single figure (point estimates) or of limits
computed with a given probability (interval estimates). Econometricians 
also make use of statistical methods in order to test certain hypotheses 
about the unknown population. This procedure is useful in the testing 
and verification of economic laws. 

Modern statistical inference is based upon the idea of the random 
variable. This is a variable which can assume certain values with definite 
probabilities (probability is here understood as the limit of the relative 
frequency). A good example is the outcome of a throw with a die. This 
random variable can assume the values one to six with definite probabilities.
With a true die, these probabilities will all be one-sixth.
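
The random variable of the die example can be written down explicitly. This is a small sketch; the names `die` and `probability_even` are hypothetical, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

# Random variable for a true die: each value from one to six
# is assumed with probability one-sixth.
die = {face: Fraction(1, 6) for face in range(1, 7)}

total_probability = sum(die.values())
probability_even = sum(prob for face, prob in die.items() if face % 2 == 0)

print(total_probability)   # the six probabilities add up to one
print(probability_even)    # probability of throwing an even number
```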

The first problem in statistical inference is point estimation. In this
case we obtain a single figure for an estimate of the unknown quantity.
The statistician uses here a number of methods, the two most important
of which are the method of maximum likelihood 14 and the method of
least squares. 15 The method of maximum likelihood chooses as the estimate
the particular value which maximizes the probability density of the
observed sample. The method of least squares chooses the value which
minimizes the sum of the squares of the deviations of the observations
from the chosen value. Both methods have
desirable properties (sections 5.1 and 5.2) and lead frequently to the same
estimates. There are also other methods available, but they are not in
general use. 16
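
That the two methods frequently lead to the same estimate can be checked in the simplest case, the estimation of a mean. The sketch below makes several assumptions that are not in the text: the five observations are invented, the population is taken to be normal with known standard deviation, and both criteria are evaluated by a crude search over a grid of candidate values.

```python
import math

observations = [2.1, 1.9, 2.4, 2.0, 1.6]   # hypothetical sample

def sum_of_squares(c, data):
    """Sum of squared deviations of the observations from the value c."""
    return sum((x - c) ** 2 for x in data)

def normal_log_likelihood(mu, data, sigma=1.0):
    """Log of the normal probability density of the sample for mean mu."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - sum_of_squares(mu, data) / (2 * sigma ** 2))

# Candidate values 1.000, 1.001, ..., 3.000
grid = [i / 1000 for i in range(1000, 3001)]

least_squares_estimate = min(grid, key=lambda c: sum_of_squares(c, observations))
maximum_likelihood_estimate = max(
    grid, key=lambda m: normal_log_likelihood(m, observations))
sample_mean = sum(observations) / len(observations)

print(least_squares_estimate, maximum_likelihood_estimate, sample_mean)
```

Under the normality assumption the log-likelihood is a decreasing function of the sum of squares, so both searches select the same grid value, which here coincides with the sample mean.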

It is evident that estimation is of great importance for the econometrician.
For instance, the estimate of -0.123 for the price elasticity
of demand for agricultural products in the United States is derived by the 

13 A. M. Mood: Theory of Statistics (New York, 1950), pp. 126 ff. W.
Feller: op. cit., pp. 4 ff.

14 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 1 ff. H. Cramer: Mathematical Methods of Statistics (Princeton, 1946),
pp. 498 ff. G. Darmois: L'emploi des observations statistiques: methodes
d'estimation. Actualites scientifiques et industrielles, 356 (Paris, 1936).

15 F. N. David and J. Neyman: "Extension of the Markoff theorem on least
squares," Statistical Research Memoirs, vol. 2 (London, 1938), pp. 105 ff.
F. N. David: Probability Theory for Statistical Methods (Cambridge, 1949),
pp. 161 ff.

16 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 50 ff.





method of maximum likelihood. It is chosen in such a way that the
probability density attains its maximum. 

Sometimes we desire more information than a single value for our 
estimate. The statistician computes what is called fiducial or confidence 
limits. 17 The theory of fiducial limits 18 and the theory of confidence 
limits are not identical, 19 and the computed limits do not always coincide. 
But we will neglect these difficulties for the moment. They do not arise 
in the simple cases we are going to discuss here. 

Fiducial or confidence limits are computed in such a fashion that the 
confidence coefficient is a preassigned number, e.g., 95 per cent. This 
has to be interpreted in the following sense: If a statistician computes a 
great many confidence or fiducial limits on the 95 per cent probability 
basis, then in the long run, on the average, these limits will enclose in 
95 per cent of the cases the true population value, whereas in 5 per cent 
of the cases the population value will fall outside the limits. 
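
The long-run interpretation given above can be imitated by simulation. The sketch below is only an illustration under assumptions not found in the text: samples of 25 observations are drawn from a normal population with known standard deviation, and the limits are the sample mean plus and minus 1.96 standard errors. The function name `coverage_rate` and all numerical settings are hypothetical.

```python
import math
import random

def coverage_rate(true_mean=0.0, sigma=1.0, n=25, trials=4000, seed=2):
    """Fraction of simulated 95 per cent limits that enclose the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)   # known-sigma normal limits
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = sum(sample) / n
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"limits enclosed the true value in {100 * rate:.1f} per cent of the cases")
```

The fraction fluctuates around 95 per cent from one run of the experiment to another; it is a property of the procedure in the long run, not of any single pair of limits.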

These ideas can also be used in econometrics. The fiducial or confidence
limits at the 95 per cent probability level are -0.052 and -0.195
for the price elasticity of demand for agricultural products in the United 
States (section 6.5, Example 3). 20 

This statement should be interpreted in the following manner. We 
can say that an increase of 1 per cent of the price of agricultural products 
in the United States will, other things being equal, be accompanied by a 
decrease in the quantity of agricultural products demanded which is not 
less than about 1/20 of 1 per cent and not more than about 1/5 of 1 per cent.
These limits are computed with a confidence coefficient of 95 per cent.
This is to say that the above statement about the limits and
similar statements which use the 95 per cent confidence coefficient have
a chance of being right in about 95 of 100 cases, in the long run, on the
average. But it is necessary to emphasize that the validity of these
procedures depends upon assumptions about normality and independence 
which may not be justified with economic data. 


17 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 62 ff. A. M. Mood: op. cit., pp. 223 ff.

18 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London,
1946), pp. 85 ff.

19 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 269 ff. H. Cramer: op. cit., pp. 525 ff. J. Neyman: "Outline of a theory
of statistical estimation based on the classical theory of probability," Philosophical
Transactions, series A, vol. 236 (1937), pp. 333 ff.

20 G. Tintner: "Multiple regression for systems of equations," Econometrica,
vol. 14 (1946), pp. 5 ff.





Another problem arising in statistical inference is the testing of statistical
hypotheses. 21 A statistical hypothesis is, of course, not derived from
the data. It is given independently of the statistical investigation, e.g., 
from considerations arising in economic theory. We follow here closely 
the theory of Neyman and Pearson. They distinguish two types of errors 
in testing a hypothesis: Type 1 error occurs if we reject a true hypothesis; 
and type 2 error, if we do not reject a false hypothesis. 

Tests of hypotheses should be designed in this way: For a given 
probability of type 1 error (called level of significance) we use the test 
which at the same time minimizes the type 2 error. Such tests do not 
always exist. 22 

Having constructed a test in this manner, we proceed in the following
fashion: We will choose a level of significance, e.g., 5 per cent. Then
we will reject every hypothesis for which a deviation as large as the one
observed would arise by chance with a probability of less than 5 per
cent, and will not reject those for which this probability is higher. If this
procedure is carried on for many tests, then on the average we will reject
a true hypothesis in about 5 per cent of the cases. The test should also
be constructed if possible in such a way that it rejects more false hypotheses
than any other test with the same level of significance.
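
The two types of error can be made concrete by a simulation. This sketch rests on assumptions that are not in the text: the hypothesis concerns the mean of a normal population with known standard deviation, the test is a two-sided test at the 5 per cent level with critical value 1.96, and the alternative mean of 0.5 is an arbitrary choice. The function name `rejection_rate` is hypothetical.

```python
import math
import random

def rejection_rate(true_mean, hypothesized_mean=0.0, sigma=1.0,
                   n=25, trials=4000, seed=3):
    """Fraction of simulated samples in which the hypothesis is rejected."""
    rng = random.Random(seed)
    critical_value = 1.96   # two-sided 5 per cent level, normal distribution
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = sum(sample) / n
        z = (mean - hypothesized_mean) / (sigma / math.sqrt(n))
        if abs(z) > critical_value:
            rejections += 1
    return rejections / trials

# Type 1 error: the hypothesis (mean = 0) is true but is rejected.
type_1_rate = rejection_rate(true_mean=0.0)
# Type 2 error: the hypothesis is false (true mean = 0.5) but is not rejected.
type_2_rate = 1.0 - rejection_rate(true_mean=0.5)

print(f"type 1 error rate: {type_1_rate:.3f}")
print(f"type 2 error rate: {type_2_rate:.3f}")
```

The type 1 error rate stays near the chosen level of significance, whereas the type 2 error rate depends on how far the true value lies from the hypothetical one and on the size of the sample.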

If such a test does not exist we have to apply other, more complicated 
criteria for the choice of the optimum test among all tests with the same 
level of significance. 23 

To illustrate again the use of this particular type of statistical inference, 
let us test on the basis of our data the hypothesis that the unknown “true" 
elasticity of demand for agricultural products in the United States is —1 
in the population which corresponds to our sample. This hypothesis 
may be based on theoretical economic considerations. It says: To a 
given percentage increase in the price of agricultural commodities corre- 


21 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 269 ff. H. Cramer: op. cit., pp. 525 ff. A. M. Mood: op. cit., pp. 245 ff.
J. Neyman and E. S. Pearson: "On the use of certain test criteria for the purpose
of statistical inference," Biometrika, vol. 20A (1928), pp. 175 ff., 263 ff.; "On
the problem of the most efficient test of statistical hypotheses," Philosophical
Transactions, series A, vol. 231 (1933), pp. 289 ff. E. J. Gumbel: "Simple
tests of given hypotheses," Biometrika, vol. 32 (1942), pp. 317 ff.

22 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 277 ff.

23 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. 307 ff. S. L. Isaacson: "On the theory of unbiased tests of simple hypotheses
specifying the values of two or more parameters," Annals of Mathematical
Statistics, vol. 22 (1951), pp. 217 ff.





sponds, ceteris paribus , an equal percentage decrease in the quantity 
demanded. 

We use a test called Student's t-test 24 and the 5 per cent level of significance.
We compute a quantity called t, which is as follows: The difference
between the empirical value (-0.123) and the hypothetical value
(-1) is divided by its standard error, which is 0.0341901 (section 6.5,
Example 3). We have:

$$ t = \frac{-0.123 - (-1)}{0.0341901} = 25.6507 $$


We have to look up our t in the tables of the t-distribution with 20
degrees of freedom. These degrees of freedom have a definite relationship
to the number of independent observations on which the statistical
analysis is based. By chance we might expect positive or negative
deviations as follows: At the 5 per cent level, t may be as large as 2.086
in absolute value if the deviation is due to chance. The value of t may be
positive or negative, because of the symmetry of the distribution. Actually
our empirical t is more than ten times as large as the one permitted at the
5 per cent level. We conclude that it is extremely unlikely that such
a large deviation between the hypothesis and the empirical results should
be due to chance. Hence, our hypothesis that the true elasticity is -1
has to be rejected, since the probability of so large a deviation, if the
hypothesis were true, is less than 5 per cent, the level of significance.
It should be emphasized that the test
is valid only under certain assumptions, like normality and independence.
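
The arithmetic of this test can be retraced in a few lines. What follows is a minimal sketch in Python and not part of the original exposition. The function name `t_statistic` and the constant names are illustrative assumptions. The numbers are those quoted above, and the critical value 2.086 is taken from the text rather than computed.

```python
# Student's t-test of the hypothesis that the true elasticity equals -1.
# All numbers are quoted from the text; names are illustrative only.

def t_statistic(estimate: float, hypothetical: float, std_error: float) -> float:
    """Deviation of the estimate from the hypothetical value, in standard errors."""
    return (estimate - hypothetical) / std_error

ELASTICITY = -0.123      # empirical elasticity of demand
STD_ERROR = 0.0341901    # its standard error (section 6.5, Example 3)
T_CRITICAL = 2.086       # tabulated |t|, 20 degrees of freedom, 5 per cent level

t = t_statistic(ELASTICITY, -1.0, STD_ERROR)
reject = abs(t) > T_CRITICAL   # two-sided test: the sign of t does not matter
print(f"t = {t:.4f}; reject the hypothesis: {reject}")
```

The comparison uses the absolute value of t because, as noted above, the distribution is symmetrical.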

A special and very important case of tests of hypotheses is tests of
significance. 25 These are constructed for the testing of a special
hypothesis, called the null hypothesis, which says that the magnitude in
question is zero. Again we choose arbitrarily a level of significance, say 5 per cent.
We will reject the null hypothesis if the probability that the divergence
between the empirical results and the hypothesis is due to chance is less
than 5 per cent, and will not reject it otherwise. If we reject the null
hypothesis we will say that the quantity tested is significantly different
from zero.


The meaning of the level of significance (5 per cent) is as follows: If
we carry out many tests of significance, we may expect to reject in the
long run, on the average, a null hypothesis in about 5 per cent of the
cases, if the hypothesis is true.
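
This long-run reading of the level of significance can be checked by simulation. The sketch below is an illustration under stated assumptions and is not taken from the text. It draws many samples from a normal population whose mean really is zero and counts how often a 5 per cent test rejects the true null hypothesis. To stay within the Python standard library it assumes a known unit variance and uses the normal critical value 1.96 instead of the t-tables. The sample size, the number of trials, and the seed are arbitrary choices.

```python
import random
import statistics

def rejection_rate(trials: int = 20000, n: int = 25, seed: int = 0) -> float:
    """Share of samples in which a true null hypothesis (mean = 0) is rejected."""
    rng = random.Random(seed)
    z_critical = 1.96  # two-sided 5 per cent point of the standard normal
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Test quantity: sample mean divided by its standard error 1/sqrt(n).
        z = statistics.fmean(sample) * n ** 0.5
        if abs(z) > z_critical:
            rejections += 1
    return rejections / trials

rate = rejection_rate()
print(f"share of true null hypotheses rejected: {rate:.3f}")
```

The printed share should lie close to 0.05, which is what the level of significance asserts.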


24 A. M. Mood: op. cit., pp. 206 ff.

25 G. W. Snedecor: Statistical Methods (4th ed., Ames, Iowa, 1946), pp. 43 ff.
T. Bancroft: “Probability values for common tests of significance,” Journal of
the American Statistical Association, vol. 45 (1950), pp. 211 ff.





As an example, consider again the elasticity of demand for agricultural
products in the United States. Is it significantly different from zero?
The null hypothesis says now that the (unknown) elasticity of demand in
the population, the “true” elasticity, is zero. Such a hypothesis may be
suggested by an economist who believes that the demand for agricultural
products is extremely inelastic, i.e., does not respond at all to price
changes. 26

We use again the t-distribution. We divide the empirical value of the
elasticity (-0.123) by its standard error (0.0341901) (section 6.5, Example
3). This gives:

$$t = \frac{-0.123}{0.0341901} = -3.5975$$


At the 5 per cent level of significance the permissible (positive or
negative) value of t is, for 20 degrees of freedom, 2.086. Our empirical t is
larger in absolute value. It is very unlikely that such a large deviation
would have arisen by pure chance. Hence, the null hypothesis that the true
value of the elasticity is zero has to be rejected. We say that the elasticity
is significantly different from zero at the level of significance of 5 per cent.
But this test depends again upon the validity of the assumptions indicated
above.
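
The test of significance is the same calculation with the hypothetical value set to zero. The short Python sketch below is an illustration only. The variable names are assumptions, and the figures are those given in the text.

```python
# Test of significance: is the elasticity significantly different from zero?
ELASTICITY = -0.123      # empirical elasticity of demand
STD_ERROR = 0.0341901    # its standard error
T_CRITICAL = 2.086       # tabulated |t|, 20 degrees of freedom, 5 per cent level

t_null = (ELASTICITY - 0.0) / STD_ERROR     # null hypothesis: true elasticity = 0
significant = abs(t_null) > T_CRITICAL      # compare in absolute value
print(f"t = {t_null:.4f}; significantly different from zero: {significant}")
```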

It should perhaps be mentioned that A. Wald 27 developed a most
ingenious theory of statistical inference, which is a generalization of the
Neyman-Pearson theory of testing hypotheses but includes also estimation
and other problems. One of these problems which is probably of great
importance for econometrics is multiple choice, i.e., the choice between
several hypotheses. Multiple choice problems will be discussed in sections
6.5, 8.1, and 11.2.2. Wald's method is based upon the assumption of the
existence of a risk-function which describes the consequences of committing
errors of various types. Since there is rarely agreement on the
goals of social policy in economic matters, such an assumption does not
seem to be very useful in our field as long as we employ the strictly
individualistic approach. 28 But it is a most useful method, e.g., in industrial
applications of statistics, where the consequences of the various types of
error can frequently be expressed in monetary terms.


26 K. E. Boulding: Economic Analysis (rev. ed., New York, 1948), pp. 129 ff.

27 A. Wald: On the Principles of Statistical Inference (Notre Dame, Ind.,
1942); Statistical Decision Functions (New York, 1950). I. J. Good: Probability
and the Weighing of Evidence (New York, 1950). See also L. V. Tick:
Past and Present Status of Statistical Inference (unpublished thesis, Ames, Iowa,
1949).

28 K. J. Arrow: Social Choice and Individual Values (New York, 1951);
The Possibility of a Universal Social Welfare Function (unpublished essay).

Carnap has developed certain ideas which seem more adequate for the 
construction of a pure, i.e., non-pragmatic, theory of statistical inference. 
This theory is, however, not yet complete, and most of the problems of 
practical interest to the economic statistician and econometrician are still 
out of its reach. Carnap's theory can deal only with attributes and not 
yet with variables which have continuous variation. Since most economic 
magnitudes belong in this last category (e.g., prices, quantities produced 
and consumed), we must await further developments of Carnap's theory 
before we may attempt to use it in econometrics. 


F. H. Knight: The Ethics of Competition (New York, 1936), pp. 19 ff., pp. 211 ff.

K. J. Arrow: “A difficulty in the concept of social welfare,” Journal of Political
Economy, vol. 58 (1950), pp. 328 ff.



Chapter 2 


A Short Sketch of Regression Methods 


Econometric research uses frequently a particular statistical method
called regression analysis. We will concern ourselves at first with linear
regression. 1 Multiple regression will be treated in section 5.1, non-linear
regression in section 8.1. The problem is as follows:

Assume that there is a random variable Y. This variable Y is connected
with another variable X by a linear relationship:

$$Y = \alpha + \beta X \tag{1}$$

This is the relationship which is supposed to hold in a (hypothetical)
infinite population: $\alpha$ is the population regression constant, and $\beta$ the
population regression coefficient of Y on X. It is the purpose of regression
analysis to predict the values of the variable Y if we know the value of
the variable X. We also desire to estimate the parameters $\alpha$ and $\beta$.

We observe N values of X: $X_1, X_2, \ldots, X_N$, and N values of Y:
$Y_1, Y_2, \ldots, Y_N$. These 2N observations constitute our sample. Let
$Y_1$ be associated with $X_1$, $Y_2$ with $X_2$, ..., $Y_N$ with $X_N$. Let a be the
empirical regression constant and b the empirical regression coefficient.
Then we will almost always experience the following fact: The empirical
regression equation:

$$Y' = a + bX \tag{2}$$

will not fit exactly. Y' is here the estimated or fitted value of Y. Certain
errors or deviations will appear, which we denote by $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$.
These are as follows:

$$\begin{aligned}
Y_1 - Y_1' &= Y_1 - a - bX_1 = \epsilon_1 \\
Y_2 - Y_2' &= Y_2 - a - bX_2 = \epsilon_2 \\
&\;\;\vdots \\
Y_N - Y_N' &= Y_N - a - bX_N = \epsilon_N
\end{aligned} \tag{3}$$



1 M. Ezekiel: Methods of Correlation Analysis (2nd ed., New York, 1941).
A. M. Mood: Theory of Statistics (New York, 1950), pp. 291 ff. M. G. 



We assume now that these errors $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ are independent. That
is to say, the distribution of any particular deviation is independent of
the distribution of all other errors. We assume also that these errors
are normally distributed, with population mean zero and population
variance $\sigma^2$. The population variance is a measure of the dispersion of
the errors in the population. It should be emphasized that the
assumptions of normality and independence will not always hold true
with economic data.

The distribution of the errors or deviations is, under these assumptions,
as follows:

$$P(\epsilon_1) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\epsilon_1^2/(2\sigma^2)} \tag{4}$$

for the probability density that the first error will have a magnitude $\epsilon_1$;
$\pi = 3.1415927$ is here the ratio of the circumference of a circle to its
diameter; $e = 2.71828$ is the base of the so-called natural system of
logarithms. Similarly we get for the second deviation $\epsilon_2$:

$$P(\epsilon_2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\epsilon_2^2/(2\sigma^2)} \tag{5}$$

and so on, until we have for the last error or deviation:

$$P(\epsilon_N) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\epsilon_N^2/(2\sigma^2)} \tag{6}$$


Since by assumption the probability of each error is independent of
the distribution of all others, the probability that a certain set of errors
or deviations $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ will occur together is given by their product.
This follows from the well-known proposition in probability theory:
The probability that two or more independent events will happen together
is the product of their separate probabilities: 2

$$P' = P(\epsilon_1)\,P(\epsilon_2)\cdots P(\epsilon_N) = \frac{1}{(2\pi)^{N/2}\sigma^{N}}\, e^{-(\epsilon_1^2 + \epsilon_2^2 + \cdots + \epsilon_N^2)/(2\sigma^2)} \tag{7}$$

This is the probability that the errors or deviations $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ will
occur together.

It is convenient to use here the notation of summation. Suppose that
we have N quantities: $z_1, z_2, \ldots, z_N$, and we want to find their sum.
Then we will denote quite generally:

$$z_1 + z_2 + \cdots + z_N = \sum_{i=1}^{N} z_i \tag{8}$$


Kendall: “Regression, structure, and functional relationship,” Biometrika,
vol. 38 (1951), pp. 11 ff.

2 A. M. Mood: op. cit., pp. 29 ff.


In this notation the probability density P' can be written more concisely:

$$P' = \frac{1}{(2\pi)^{N/2}\sigma^{N}}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{N}\epsilon_i^2} \tag{9}$$

Substituting from the previous set of equations (3), which define the
errors $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$, we see that the differences $Y_1 - a - bX_1$,
$Y_2 - a - bX_2$, ..., $Y_N - a - bX_N$ are also normally and independently distributed.
Their probability density is:

$$P' = \frac{1}{(2\pi)^{N/2}\sigma^{N}}\, e^{-Q/(2\sigma^2)} \tag{10}$$

where we abbreviate:

$$Q = (Y_1 - a - bX_1)^2 + (Y_2 - a - bX_2)^2 + \cdots + (Y_N - a - bX_N)^2 = \sum_{i=1}^{N} (Y_i - a - bX_i)^2 \tag{11}$$


The observations $Y_1, Y_2, \ldots, Y_N$, $X_1, X_2, \ldots, X_N$ are given empirical
data. We can, however, dispose of the constants a and b. We may,
for instance, use the method of maximum likelihood, which has a number
of desirable properties (section 5.2). If we do this, we will choose the
estimates of the constants $\alpha$ and $\beta$ in the population, which we designate
by a and b, in such a way that the probability density P' [formula (9)]
becomes a maximum. The sample is representative of the population,
and we make the probability of the sample as large as possible.

We can rewrite the probability density P' [formula (9)] also in the
following way:

$$P' = \frac{1}{(2\pi)^{N/2}\sigma^{N}\, e^{Q/(2\sigma^2)}} \tag{12}$$

We note that P' is a fraction. The parameters a and b which are to
be estimated appear only in the quantity Q [formula (11)]. This quantity
is in the denominator of P' in formula (12). A fraction is the larger,
the smaller its denominator. Evidently the probability density P' will
be the larger, the smaller Q is. Hence we have reduced the problem to
finding the smallest possible value of Q.
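
The same conclusion can be reached by taking logarithms. The lines below are a short sketch of that step and not part of the original derivation. They use only the symbols already defined in formulas (10) to (12) and treat $\sigma$ as a fixed positive constant.

```latex
\begin{align*}
\log P' &= -\frac{N}{2}\log(2\pi) \;-\; N\log\sigma \;-\; \frac{Q}{2\sigma^{2}}, \\
\frac{\partial \log P'}{\partial Q} &= -\frac{1}{2\sigma^{2}} \;<\; 0 .
\end{align*}
```

Since the logarithm is an increasing function and only the last term depends on a and b, the values of a and b which make P' largest are those which make Q smallest.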

It is apparent that Q is a sum of squares. It is indeed the sum of the
squares of the deviations of the observed values of Y ($Y_1, Y_2, \ldots, Y_N$)
from the “fitted” values: $a + bX_1$, $a + bX_2$, ..., $a + bX_N$. The best
fit in the sense of the method of maximum likelihood will be achieved
in this case by an application of the method of least squares (section 5.1).
We have to minimize the sum of squares Q in order to achieve the maximum
likelihood estimates of our parameters $\alpha$ and $\beta$.

It should be mentioned that we could have used the method of least
squares under less stringent assumptions than the ones made above about
the distribution of the errors. If we only make the assumption that the
errors or deviations $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ are independently distributed, without
assuming that the distributions are normal, we can use the method of
least squares to achieve the least squares estimates of the parameters.
The Markoff 3 theorem (section 5.1) assures us that these estimates will,
even under these much less stringent assumptions, have certain desirable
properties. It has been shown that even the assumption of independence
may be dropped. This theory is discussed in section 10.5.

To minimize Q we differentiate it partially with respect to a and b
and set the partial derivatives equal to zero. The sufficient conditions
of a minimum are also fulfilled. The least squares estimates of $\alpha$ and $\beta$,
namely, a* and b*, are the solutions of the two normal equations:

$$\begin{aligned}
N a^* + \Big(\sum_{i=1}^{N} X_i\Big) b^* &= \sum_{i=1}^{N} Y_i \\
\Big(\sum_{i=1}^{N} X_i\Big) a^* + \Big(\sum_{i=1}^{N} X_i^2\Big) b^* &= \sum_{i=1}^{N} X_i Y_i
\end{aligned} \tag{13}$$

Here N is the number of observations. The sums are easily computed
from the data, especially with the help of modern calculating machines.

Example 1. The following example illustrates the computation of the
sums:

TABLE 1

            X      Y    X^2     XY    Y^2
            1      4      1      4     16
            3      2      9      6      4
            0      2      0      0      4
            4      4     16     16     16
Sums:       8     12     26     26     40


3 A. A. Markoff: Wahrscheinlichkeitslehre (Leipzig, 1912). F. N. David
and J. Neyman: “Extension of the Markoff theorem on least squares,” Statistical
Research Memoirs, vol. 2 (London, 1938), pp. 105 ff. F. N. David:
Probability Theory for Statistical Methods (Cambridge, 1949), pp. 161 ff.





We have four observations; hence N = 4. From Table 1 we have
the sums: $\sum X = 8$, $\sum Y = 12$, $\sum X^2 = 26$, $\sum XY = 26$, $\sum Y^2 = 40$.

The normal equations (13) are in this case:

$$\begin{aligned}
4a^* + 8b^* &= 12 \\
8a^* + 26b^* &= 26
\end{aligned} \tag{14}$$

This simple system of two linear equations can easily be solved. It gives:

a* = 2.6, b* = 0.2

The equation

$$Y^* = a^* + b^*X \tag{15}$$

provides the estimate of Y if X is given, under the assumptions stated.
Y* is here the least squares estimate of Y. It is in our case:

$$Y^* = 2.6 + 0.2X \tag{16}$$

Suppose we know that X = 2. Then the estimate for Y under the
conditions stated is Y* = 2.6 + (0.2)(2) = 3. Suppose, on the other
hand, that we know that X = 10. Then our best estimate for Y is, under
the conditions stated above, Y* = 2.6 + (0.2)(10) = 4.6, etc.
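
Example 1 can be retraced mechanically. The following Python sketch is an illustration and not the author's procedure. It forms the sums of Table 1, solves the two normal equations (13) by elimination, and evaluates the fitted line. The function names `fit_line` and `predict` are assumptions introduced here.

```python
# Data of Table 1.
X = [1, 3, 0, 4]
Y = [4, 2, 2, 4]

def fit_line(x, y):
    """Solve the normal equations (13) for the least squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(u * v for u, v in zip(x, y))
    # Eliminating a from the two normal equations gives b; a follows from the first.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

def predict(a, b, x_value):
    """Fitted value Y* = a* + b* X, as in equation (15)."""
    return a + b * x_value

a_star, b_star = fit_line(X, Y)
print(f"a* = {a_star:.1f}, b* = {b_star:.1f}")
print(f"Y*(2) = {predict(a_star, b_star, 2):.1f}, Y*(10) = {predict(a_star, b_star, 10):.1f}")
```

With only four observations the sums are easy to verify against Table 1 by hand.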

The computed value of a* is also the estimate of the regression constant
$\alpha$ in the population, and the computed value of b* is the estimate of the
regression coefficient $\beta$ of Y on X in the population, if either of the
following conditions is fulfilled:

(a) The errors or deviations $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ are errors in the equation.
This is to say, they are the result of certain variables which influence Y
(apart from X), but which have not been included in the equation. For
instance, if Y is the consumption of beef and X is the price of beef, then
income may be such a variable. The prices of pork, mutton, chicken,
etc., should also be included. If all these variables are neglected and
not included in the regression equation, they will produce deviations
which will be errors in the equation. Errors in the equations are discussed
in Chapter 7.

(b) The errors are errors in the variables. This type of errors is similar
to errors of observations and results from certain deficiencies in the series 
of data used in the analysis. Then they must occur only in Y and not 
in X. To return to our previous example, a* and b* will be valid 
estimates of the parameters in the population regression equation only if 
errors occur just in the variable Y (consumption of beef), the variable X 
(price of beef) being measured without errors. The errors in the variables 
are similar to the errors of observation in the natural sciences. Their 
analysis is discussed in section 6.5. 





If we assume again normality and independence of the errors or deviations,
we can compute a quantity which is useful for testing hypotheses,
tests of significance, and the establishing of fiducial or confidence limits
for the regression coefficient b*:

$$t = \frac{b^* - \beta}{s} \tag{17}$$

Here b* is the least squares regression coefficient, $\beta$ is the (hypothetical)
population regression coefficient, and s is the so-called standard error of
the regression coefficient. The square of s is given by:

$$s^2 = \frac{\Big[N\sum_{i=1}^{N} Y_i^2 - \Big(\sum_{i=1}^{N} Y_i\Big)^2\Big] - \dfrac{\Big[N\sum_{i=1}^{N} X_iY_i - \Big(\sum_{i=1}^{N} X_i\Big)\Big(\sum_{i=1}^{N} Y_i\Big)\Big]^2}{N\sum_{i=1}^{N} X_i^2 - \Big(\sum_{i=1}^{N} X_i\Big)^2}}{(N-2)\Big[N\sum_{i=1}^{N} X_i^2 - \Big(\sum_{i=1}^{N} X_i\Big)^2\Big]} \tag{18}$$

This quantity t follows, under the circumstances assumed, the so-called
t-distribution with N - 2 degrees of freedom. The number of degrees
of freedom is two less than the number of observations (N), because two
constants (a* and b*) have been estimated from the data.

The quantity t measures the deviation between the empirical value b*
and the hypothetical value $\beta$ on an appropriate scale.

For the data in our example we obtain from formula (18):

$$s = \sqrt{\frac{(4)(40) - (12)^2 - \dfrac{[(4)(26) - (8)(12)]^2}{(4)(26) - (8)^2}}{(4-2)\,[(4)(26) - (8)^2]}} = 0.424 \tag{19}$$

Hence the quantity:

$$t = \frac{0.2 - \beta}{0.424} = (0.2 - \beta)(2.36) \tag{20}$$

follows the t-distribution with 4 - 2 = 2 degrees of freedom. The number
of degrees of freedom is the number of observations (4) minus the number
of estimated constants (2).
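
The standard error in formula (19) is a direct substitution of the sums of Table 1 into formula (18). The Python sketch below retraces it. It is an illustration only, and the function name `slope_standard_error` is an assumption.

```python
import math

def slope_standard_error(n, sum_x, sum_y, sum_xx, sum_xy, sum_yy):
    """Standard error s of the regression coefficient, following formula (18)."""
    d = n * sum_xx - sum_x ** 2                       # N*sum(X^2) - (sum X)^2
    numerator = (n * sum_yy - sum_y ** 2) - (n * sum_xy - sum_x * sum_y) ** 2 / d
    return math.sqrt(numerator / ((n - 2) * d))

# Sums from Table 1: N = 4, sum X = 8, sum Y = 12, sum X^2 = 26, sum XY = 26, sum Y^2 = 40.
s = slope_standard_error(4, 8, 12, 26, 26, 40)
print(f"s = {s:.3f}, 1/s = {1 / s:.2f}")
```

The reciprocal 1/s is the factor 2.36 which appears in formula (20).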

By making $\beta = 0$ we may test the null hypothesis that in the population
the regression coefficient is zero. This will be the case if there is no
(linear) relationship between Y and X in the population. This test is
called a test of significance. The distribution of t is tabulated. If, from
the point of view of a given level of significance, we reject the null
hypothesis, then we will say that b* is significant, i.e., it is significantly
different from zero. If we do not reject the null hypothesis, then we will
say that b* is not significant. That is to say, in the second case there is,
in all probability, no linear relationship between Y and X in the population.

Let us fix a level of significance of 5 per cent for the example just
discussed. We want to test the significance of the empirical regression
coefficient b* = 0.2; i.e., we want to test the null hypothesis that the
corresponding population regression coefficient $\beta$ is zero.

Putting $\beta = 0$ in the above formula (20), we have t = (0.2)(2.36) = 0.47.
But our tables tell us that at the 5 per cent level of significance we may
expect a positive or negative t which may for 2 degrees of freedom be as
large as 4.303. Hence there is no reason to reject the hypothesis. We
say that our empirical regression coefficient is not significant. It is so
small that there is a great likelihood that in the population from which
our sample is taken there is indeed no linear relationship between X
and Y.

We may also use the quantity t for tests of hypotheses. Assume that
we make the hypothesis that in the population $\beta$ has a specific value,
say $\beta'$. Then we will substitute $\beta'$ into formula (20) and use the tabulated
values of t for the correct number of degrees of freedom (N - 2) for the
test of the hypothesis that in the population $\beta = \beta'$. If the probability
is less than a previously assigned probability level, say 5 per cent, then we
will reject the hypothesis. If the value is greater, we will not reject it.

Let us return to our numerical example. Assume that the hypothesis
says that in the population the regression coefficient is $\beta' = 4$. We use
again the level of significance of 5 per cent. This is to say, we agree to
reject a hypothesis which is such that, assuming it to be true, deviations
as large as that between the hypothetical value $\beta' = 4$ and the empirical
value b* = 0.2 in our sample could have arisen by chance alone in less
than 5 per cent of the cases.

Our t is, in this case, t = (0.2 - 4)(2.36) = -8.97. But the maximum
positive or negative t for 2 degrees of freedom which could arise by chance
is 4.303, assuming the 5 per cent level of significance. Our empirical t is
-8.97; i.e., it is about twice as large in absolute value. Hence its
probability to have arisen by chance is very small, certainly less than
5 per cent if the hypothesis is true. The hypothesis that in the population
the true regression coefficient is $\beta = 4$ has to be rejected.
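
The two tests just described differ only in the hypothetical value of the regression coefficient. The Python sketch below puts them side by side. It is an illustration, with the function name `t_for_hypothesis` introduced as an assumption. It divides by the standard error 0.424 of formula (19), whereas the text multiplies by the rounded reciprocal 2.36, so the last digit of the second result can differ slightly from the -8.97 quoted above.

```python
B_STAR = 0.2        # empirical regression coefficient
S = 0.424           # its standard error, formula (19)
T_CRITICAL = 4.303  # tabulated |t|, 2 degrees of freedom, 5 per cent level

def t_for_hypothesis(beta_hypothetical: float) -> float:
    """Quantity t of formula (20) for a given hypothetical coefficient."""
    return (B_STAR - beta_hypothetical) / S

t_zero = t_for_hypothesis(0.0)   # test of significance
t_four = t_for_hypothesis(4.0)   # test of the hypothesis that the coefficient is 4

for label, t in (("beta = 0", t_zero), ("beta = 4", t_four)):
    verdict = "reject" if abs(t) > T_CRITICAL else "do not reject"
    print(f"{label}: t = {t:.2f} -> {verdict}")
```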

By choosing an appropriate level of probability (called confidence or
fiducial coefficient) we can also establish fiducial or confidence limits for
our estimate b*. These are chosen in such a way that the probability is,
say, 95 per cent (confidence coefficient) that the limits so computed will
enclose the true or population value of $\beta$.

Let us try to establish the 95 per cent confidence or fiducial limits for
our empirical regression coefficient. We have to find two numbers which
are such that these limits will have a 95 per cent chance to enclose the
true population regression coefficient $\beta$.

Using methods similar to preceding ones, we get for these limits the
equation:

$$(0.2 - \beta)(2.36) = \pm 4.303 \tag{21}$$

The value of t at the 5 per cent level of significance is 4.303 for 2 degrees
of freedom and may be positive or negative, since the distribution is
symmetrical. Solving equation (21), we get, for the limits at the 95 per
cent probability level, 2.02 and -1.62.

These limits have the following property: Assume that many limits 
are computed with the 95 per cent coefficient. Then in the long run, on 
the average, the true population value will fall between them in about 
95 per cent of all cases and outside in about 5 per cent of all cases. 
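
Solving equation (21) for the coefficient amounts to stepping a tabulated multiple of the standard error to either side of the estimate. The Python sketch below shows this. It is an illustration with assumed variable names, and it uses s = 0.424 from formula (19) and the tabulated value 4.303.

```python
B_STAR = 0.2        # empirical regression coefficient
S = 0.424           # its standard error, formula (19)
T_CRITICAL = 4.303  # tabulated |t|, 2 degrees of freedom, 5 per cent level

half_width = T_CRITICAL * S      # distance from the estimate to either limit
lower = B_STAR - half_width
upper = B_STAR + half_width
print(f"95 per cent limits: {lower:.2f} and {upper:.2f}")
```

The limits are very wide because only two degrees of freedom are available in this small example.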

Simple regression analysis as presented above is subject to certain
limitations.

(a) It yields essentially the least squares estimate of Y, i.e., Y*, which
can be computed by assuming the value of X as known. Hence it is
suited for prediction. By exchanging the role of X and Y in our analysis
we can derive another regression equation:

$$X' = A + BY \tag{22}$$

This is the regression of X on Y. It is now best suited for predicting
X if Y is known.

The first regression equation (15) results from an effort to minimize
the sums of squares of deviations from the regression line measured
vertically (in the Y-direction). The second regression equation (22) is
to be chosen if we minimize the sum of squares of the deviations measured
horizontally (in the X-direction). The “true” or population regression
equation will lie between the two empirical equations if there are errors
in both variables. 4 This problem will be discussed in section 6.5.

(b) The estimates a* and b* are estimates of the parameters $\alpha$ and
$\beta$ in the population only if errors or deviations occur in the variables
$Y_1, Y_2, \ldots, Y_N$, but not in the variables $X_1, X_2, \ldots, X_N$.

(c) The relationship between Y and X must be linear. If we, however,
substitute log Y for Y in our analysis, we get an approximation
for an exponential relationship. If we substitute log X for X and
log Y for Y, we get an approximate estimate for a parabolic or hyperbolic
relationship, etc. This last method is especially useful in econometric
research. It will be discussed in section 3.4.


4 G. Tintner: “An application of the variate difference method to multiple
regression,” Econometrica, vol. 12 (1944), pp. 97 ff.

(d) Tests of significance, tests of hypotheses, and the computation of
fiducial or confidence limits are subject to the assumption that the “errors”
$\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ are normally and independently distributed. There is some
reason to believe that small deviations from normality will not seriously
affect the results. 5 But, if the errors are not independent, i.e., if the
distribution of a given deviation is not independent of the distribution
of all the other errors, then the consequences are more serious. The
number of degrees of freedom is seriously affected, since this number is
related to the number of independent observations. This subject is
discussed in Part 3 of this book (section 10.1.3).
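
Limitation (a) can be made concrete with the data of Table 1. The Python sketch below is an illustration and not part of the original text. It fits both the regression of Y on X and the regression of X on Y and then expresses the second line in the same coordinates as the first. The function name `fit_line` and the variable names are assumptions.

```python
# Data of Table 1.
X = [1, 3, 0, 4]
Y = [4, 2, 2, 4]

def fit_line(u, v):
    """Least squares line v' = c + d*u; returns (c, d)."""
    n = len(u)
    sum_u, sum_v = sum(u), sum(v)
    sum_uu = sum(a * a for a in u)
    sum_uv = sum(a * b for a, b in zip(u, v))
    d = (n * sum_uv - sum_u * sum_v) / (n * sum_uu - sum_u ** 2)
    c = (sum_v - d * sum_u) / n
    return c, d

a_star, b_star = fit_line(X, Y)   # regression of Y on X, equation (15)
A, B = fit_line(Y, X)             # regression of X on Y, equation (22)

# Rewrite X' = A + B*Y as Y = (X - A)/B to compare slopes in the same plane.
implied_slope = 1.0 / B
implied_intercept = -A / B
print(f"Y on X: Y* = {a_star:.1f} + {b_star:.1f} X")
print(f"X on Y: X' = {A:.1f} + {B:.1f} Y, i.e. Y = {implied_intercept:.1f} + {implied_slope:.1f} X")
```

For these four observations the two lines differ widely in slope, which illustrates why the choice of the dependent variable matters when both variables may be subject to error.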

Simple linear regression as described above may be generalized to
non-linear regression. We assume, e.g., that the relationship between Y and
X is of the form of a polynomial, and the regression equation becomes:

$$Y' = a + b_1 X + b_2 X^2 + b_3 X^3 + \cdots + b_p X^p \tag{23}$$

We get again a set of normal equations if we use the method of least
squares, and can also establish the distributions of the regression
coefficients $b_1^*, b_2^*, \ldots, b_p^*$ computed from the normal equations. These
are the least squares estimates and under certain conditions estimates of
the polynomial relationship in the population. Tests of significance,
tests of hypotheses, and confidence or fiducial limits can be established
in a manner similar to that indicated above. The fitting of orthogonal
polynomials is discussed in section 8.1.

In econometrics we frequently have to make the assumption that there
is a linear relationship between more than two variables. For instance,
in demand analysis, we may want to explain consumption by a knowledge
of price and income. Let the relationship in the population be:

$$Z = \alpha + \beta X + \gamma Y \tag{24}$$

Then we can assume a multiple regression equation:

$$Z' = a + bX + cY \tag{25}$$

Proceeding in a way similar to that used in the case of simple regression,
we get again a set of normal equations. These give the least squares
estimate of Z, say Z*, if the values of X and Y are given. Under certain
circumstances they will also give estimates of the linear relationship
existing in the population between the three variables. Multiple regression
is discussed in section 5.1.


5 R. C. Geary: “The distribution of Student's ratio for non-normal samples,”
Journal of the Royal Statistical Society, supplement, vol. 3 (1936), pp. 178 ff.
M. S. Bartlett: “The effect of non-normality on the t-distribution,” Proceedings
of the Cambridge Philosophical Society, vol. 31 (1935), pp. 223 ff.

Tests of significance, tests of hypotheses, and confidence or fiducial
limits can again be established in a way similar to that used before.

Multiple regression methods have been employed very extensively in
econometric research. There are, however, certain difficulties connected
with their use.


(a) Multicollinearity. If there is a very close relationship between two
determining variables, say between X and Y in our example, formula (25),
then it may not be possible to find the individual regression coefficients
with sufficient accuracy. The regression equation remains valid for
prediction, i.e., for estimating Z if X and Y are given.

For instance, Z may be the demand for coffee, X the price of coffee,
and Y the national income in a coffee-producing country, e.g., Brazil.
Then there will in all probability be a very close relationship between X
and Y, since most of the national income is derived from the sale of
coffee. Under these circumstances it is not possible to determine
accurately the linear relation between X, Y, and Z.

Problems of this nature have been investigated by Frisch, 6 who developed
the method of bunch map analysis to deal with this situation. This is
essentially a graphical method and leaves a great deal to the personal
judgment of the statistician. The author 7 has developed another method,
which is applicable if there are errors in the variables and their relative
magnitude is known. This method is presented in section 6.5.

(b) Identification. The data which are the basis of our analysis will
actually be the result of the interaction of several economic relationships. 8


6 R. Frisch: Statistical Confluence Analysis by Means of Complete Regression
Systems (Oslo, 1934). R. Frisch and B. D. Mudgett: “Statistical correlation
and the theory of cluster types,” Journal of the American Statistical Association,
vol. 26 (1931), pp. 275 ff. R. Stone: The Role of Measurement in Economics
(Cambridge, 1951), pp. 7 ff.

7 G. Tintner: “A note on rank, multicollinearity and multiple regression,”
Annals of Mathematical Statistics, vol. 16 (1945), pp. 304 ff.

8 T. Haavelmo: “The probability approach in econometrics,” Econometrica,
vol. 12 (1944), supplement. T. C. Koopmans: “Identification problems in
economic model construction,” ibid., vol. 17 (1949), pp. 125 ff. S. Morris
Livingston, A. Smithies, J. L. Mosak: “Forecasting postwar demand,” ibid., vol. 13
(1945), pp. 1 ff. L. R. Klein: “A postmortem on transition predictions of
national income,” Journal of Political Economy, vol. 54 (1946), pp. 289 ff. G.
Tintner: “Die Identifikation: ein Problem der Oekonometrie,” Statistische
Vierteljahresschrift, vol. 3 (1950), pp. 68 ff.





Take the simplest case of the quantity sold and the price established on
an isolated market as an example. They are the result of the interaction
of a demand equation and a supply equation. It is by no means certain
which equation will be revealed by a simple least squares regression
analysis of the quantities on the prices or of the prices on the quantities.

In this very simple case we may reach the following conclusions: 9 

(1) Let us assume that the (per capita) demand function is unchanged 
over time. This is probably the case with many agricultural commodities, 
since consumption habits change very slowly. On the other hand, the 
supply function shifts very violently (e.g., because of changing weather 
conditions which produce good and bad harvests). Then we can approximate
by simple regression analysis the demand function, but not the
supply function. 

(2) Assume that the supply function is stable over time, e.g., because 
cost conditions which are fundamental for the supply of the commodity 
change very slowly in an old and very well-established industry where 
there is little technological change. This may be the case, for instance, 
in the cement industry. On the other hand, the demand function fluctuates
very violently. This may, for instance, be due to the fact that
cement is used for building and very little building takes place during 
a slump, but there is much construction during a boom. Under these 
assumptions it is possible to derive statistically the supply function but 
not the demand function. 

(3) Assume that the demand and the supply function fluctuate. This 
is, of course, the most common case. Then a regression analysis will 
not approximate the demand or the supply function but a mixture of both. 

It is then necessary to introduce additional variables. We may, for 
instance, assume that the demand function depends upon income and 
price, and the supply function upon a cost factor (wages) and price. 10 
If the two new variables are not too closely related, we can derive the 
demand and the supply functions by methods of multiple regression. 

(4) Another way out is open with certain agricultural commodities 
which have a fixed period of production. This is, for instance, the case 
in the production of pigs. Then we get what is known in economics as 
the cobweb phenomenon. 11 The quantity demanded depends upon the 


9 H. Schultz: Theory and Measurement ofDemand (Chicago, 1938), pp. 72 ff. 

10 T. C. Koopmans: “Statistical estimation of simultaneous economic rela¬ 
tions," Journal of the American Statistical Association , vol. 40 (1945), pp. 448 ff. 
G. Tintner: “Multiple regression for systems of equations," Econometrica , 
vol. 14 (1946), pp. 5 ff. 

11 M. Ezekiel: “The cobweb theorem,” Readings in Business Cycle Theory





present price, and the quantity supplied upon the price which prevailed 
when production was started. A simple regression of the quantity with 
the contemporary price should give us an approximation of the demand 
function. A lag correlation of the quantity with the price lagged by the 
period of production should give an estimate of the supply function. 
Such procedures are discussed in section 3.2. 

Problems of identification will be discussed in section 6.5 and in Chapter 
7. The use of lags is discussed in section 10.3.5. 
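The four cases above can be checked on artificial data. The following Python sketch is only an illustration under stated assumptions: the linear demand and supply curves, their coefficients, the sizes of the disturbances, and all function names are hypothetical and are not taken from the text. It simulates a market in which (1) mainly supply shifts, (2) mainly demand shifts, (3) both shift, and (4) supply responds to the price of the preceding period, and it reports the least squares slope of quantity on price in each case.

```python
import numpy as np


def slope_q_on_p(p, q):
    """Least squares slope of the simple regression of quantity on price."""
    return np.polyfit(p, q, 1)[0]


def simulate_market(sd_demand_shift, sd_supply_shift, n=2000, seed=0):
    """Equilibrium prices and quantities for a hypothetical linear market.

    Demand: q = 10 - 1.0 * p + u   (u shifts the demand curve)
    Supply: q =  2 + 0.5 * p + v   (v shifts the supply curve)
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sd_demand_shift, n)
    v = rng.normal(0.0, sd_supply_shift, n)
    p = (8.0 + u - v) / 1.5      # price at which demand equals supply
    q = 10.0 - p + u             # quantity read off the demand curve
    return p, q


def simulate_cobweb(n=2000, seed=1):
    """Cobweb market: supply depends on the price of the preceding period."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, 0.1, n)  # small demand disturbance (assumed)
    v = rng.normal(0.0, 1.0, n)  # supply disturbance, e.g. weather (assumed)
    p = np.empty(n)
    q = np.empty(n)
    p_prev = 16.0 / 3.0          # stationary price of the undisturbed system
    for t in range(n):
        q[t] = 2.0 + 0.5 * p_prev + v[t]  # output decided one period earlier
        p[t] = 10.0 - q[t] + u[t]         # price that clears the market
        p_prev = p[t]
    return p, q


case1 = slope_q_on_p(*simulate_market(0.05, 1.0))  # (1) supply fluctuates
case2 = slope_q_on_p(*simulate_market(1.0, 0.05))  # (2) demand fluctuates
case3 = slope_q_on_p(*simulate_market(1.0, 1.0))   # (3) both fluctuate

p, q = simulate_cobweb()
cobweb_demand = slope_q_on_p(p, q)            # quantity on current price
cobweb_supply = slope_q_on_p(p[:-1], q[1:])   # quantity on lagged price

print(round(case1, 2), round(case2, 2), round(case3, 2))
print(round(cobweb_demand, 2), round(cobweb_supply, 2))
```

Under these assumptions the fitted slope should lie near the demand slope (-1.0) in case (1), near the supply slope (0.5) in case (2), and between the two in case (3). In the cobweb case, with a small demand disturbance, the contemporaneous regression should approximate the demand slope and the lagged regression the supply slope.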


(Philadelphia, 1945), pp. 422 ff. See also A. Hanau: “Die Prognose der
Schweinepreise,” Vierteljahrshefte zur Konjunkturforschung (Sonderheft 18,
Berlin, 1930).



Chapter 3 


Some Illustrations of Econometric Research 


We will here present some examples of the application of econometric
methods to economic problems. We will not deal with mathematical
economics or statistical economics, however closely related to econometrics.
Both mathematical economics and statistical economics have
actually covered much wider fields than econometrics proper. We will
consider only econometrics, i.e., examples of the effort to evaluate numerically
and to verify statistically economic laws or theorems. We will
illustrate each particular type of econometric research by some examples,
in order to emphasize the meaning of the various attempts to find numerical
economic laws. We will also stress possible applications to economic
policy.

Statistical methodology as applied to econometrics has progressed 
considerably during the last few years. Hence many of the older econometric
investigations are nowadays obsolete, since they were carried out
with the use of statistical methods which cannot be considered adequate. 
They are still presented as illustrations, since they show the possible range 
of applications of econometric methods. Parts 2 and 3 of the present 
book are devoted to the development and discussion of more adequate 
statistical procedures which ought to be used in econometric research. 

Instead of giving a historical account we will treat some of the main 
problems to which econometrics has been applied. We will discuss 
examples of the application of econometric methods in the following 
fields: (a) derivation of demand functions (section 3.1); (b) supply
functions (section 3.2); (c) cost functions (section 3.3); (d) production
functions (section 3.4); (e) utility and related functions (section 3.5);
(f) tableau économique (section 3.6); (g) static models of the total economy
(section 3.7); (h) dynamic models of the economy (section 3.8).

3.1 Demand Functions 

This is the field in which the first applications of econometrics were 
made; we refer to the bold attempt of the English economist Gregory 
King 1 in the seventeenth century to formulate numerically a relationship 
between a defect in the harvest and the corresponding increase in the 



price of corn. The next effort came from a pioneer of modern econometrics,
the American economist H. L. Moore. 2 He attempted to estimate
in his book Economic Cycles in 1914 price elasticities for the demand
for corn, hay, oats, and potatoes. Pigou, 3 Lehfeld, 4 and others were also
pioneers in the field. Leontief 5 contributed a very interesting variant,
and Frisch 6 a penetrating criticism which was influential for later work
in the field.

The main work dealing with the statistical derivation of demand functions
is, however, due to the late Professor Henry Schultz of the University
of Chicago. His Statistical Laws of Demand and Supply with Special
Application to Sugar 7 is still worth-while reading. His life work, published
shortly before his untimely death, is The Theory and Measurement of
Demand, 8 which is the main source of methodology in this field, even if
some of the statistical methods used may now be considered as superseded.


1 H. Higgs, ed.: Palgrave's Dictionary of Political Economy, vol. 2 (London,
1923), pp. 505 ff., article on Charles Davenant and Gregory King. W. S. Jevons:
Theory of Political Economy (4th ed., 1871), pp. 152 ff.

2 H. L. Moore: Economic Cycles: Their Law and Their Cause (New York,
1914); Forecasting the Yield and Price of Cotton (New York, 1917).

3 A. C. Pigou: “A method of determining the numerical value of elasticities
of demand,” Economic Journal, vol. 20 (1910), pp. 636 ff.; Economics of Welfare
(London, 1920), app. II; “The statistical derivation of demand curves,” Economic
Journal, vol. 40 (1930), pp. 384 ff.; “Marginal utility of money and elasticities
of demand,” Quarterly Journal of Economics, vol. 50 (1935), pp. 532 ff.
M. Friedman: “Professor Pigou's method of measuring elasticities of demand
from budgetary data,” ibid., vol. 50 (1935), pp. 151 ff.

4 R. A. Lehfeld: “The elasticity of the demand for wheat,” Economic Journal,
vol. 24 (1914), pp. 212 ff.

5 W. W. Leontief: “Ein Versuch zur statistischen Analyse von Angebot und
Nachfrage,” Weltwirtschaftliches Archiv, vol. 30, pp. 1 ff.

6 R. Frisch: Pitfalls in the Statistical Construction of Demand and Supply
Curves (Leipzig, 1933).

7 H. Schultz: Statistical Laws of Demand and Supply with Special Application
to Sugar (Chicago, 1928).

8 H. Schultz: The Theory and Measurement of Demand (Chicago, 1938).
See also R. Roy: Contributions aux recherches économétriques, Actualités
scientifiques et industrielles, 412 (Paris, 1936). V. Rouquet La Garrigue: Les
problèmes de la corrélation et de l'élasticité: étude théorique autour de la loi de
King, vols. 1, 2, Actualités scientifiques et industrielles, 1039, 1043 (Paris, 1948).
G. S. Shepherd: Agricultural Price Analysis (3rd ed., Ames, Iowa, 1950), pp.
5- n. E. W. Gilboy: “Methods of measuring demand or consumption,”





We take from this magnum opus some illustrations to show both the 
achievements and the difficulties of the econometric analysis of demand. 

Henry Schultz estimated in his book demand functions for the following 
commodities: sugar, corn, cotton, hay, wheat, potatoes, oats, barley, 
rye, buckwheat. He also investigated the interrelations of the demands 
for sugar, tea, and coffee, the interrelations of the demands for barley, 
corn, hay, and oats, and the interrelations of the demands for beef, pork, 
and mutton. 

Example 1. Schultz derived the statistical demand function for wheat 
in the United States 9 in the following form: The data are annual figures, 
1921-34. Let x be the quantity of utilized wheat less seed in bushels 
per capita. Let p be the deflated farm price of wheat in cents per bushel. 
The deflator is the U.S. index of wholesale prices, computed by the 
Bureau of Labor Statistics, 1913 = 100; t is time, origin 1928. 

By a straightforward least squares multiple regression analysis Schultz 
derived for wheat the following demand function in logarithmic form: 

(1)    $\log x = 1.0802 - 0.2143 \log p - 0.00358\,t - 0.00163\,t^{2}$

The last two terms in formula (1) describe a shift in the demand for 
wheat during the period. The regression coefficient of log p is significant
but those of $t$ and $t^{2}$ are not. This indicates that there has been probably
no significant shift of the demand function for wheat during the period 
analyzed. 

The above equation gives us immediately an estimate of the price 
elasticity of wheat. It is -0.2143. Other things being equal, if the
price of wheat increases by 1 per cent, then the demand for wheat decreases
by a little more than 1/5 of 1 per cent.

In order to derive fiducial or confidence limits for the elasticity of wheat 
we note that our period covers 14 years. The number of degrees of 
freedom is 10; the standard error of the price elasticity is 0.0398. Hence, 
the 5 per cent fiducial or confidence limits for the price elasticity of 
wheat are -0.3050 and -0.1236.

This can be interpreted in the following way: An increase of 1 per cent 
in the price of wheat will probably bring about a decrease in the quantity 


Review of Economic Statistics, vol. 21 (1939), pp. 69 ff. W. Winkler: Grundfragen
der Oekonometrie (Vienna, 1951), pp. 115 ff. L. H. Bean and G. B.
Thorne: “The use of trend residuals in constructing demand curves,” Journal
of the American Statistical Association, vol. 27 (1932), pp. 61 ff.

9 H. Schultz: The Theory and Measurement of Demand (Chicago, 1938), 
pp. 361 ff. 





demanded by not less than about 1/8 of 1 per cent and not more than about
3/10 of 1 per cent.

The theoretical and practical importance of such results, if they are
somewhat reliable, is apparent. The elasticity of the demand for wheat
with respect to the price is relatively small. This is a result which has
long been suspected by the theoretical economists, but it is important to
have it confirmed by statistical analysis. It is also perhaps preferable to
have a numerical estimate of the elasticity instead of talking about low
elasticity in general.

We will illustrate a possible use of these results in economic policy by
the following example: Assume that conditions are approximately the
same in the United States as in the period considered. Let the government
contemplate an increase in the price of wheat by 10 per cent. This may
be accomplished, for instance, by price fixing. Then, if our results hold
true, the consumption of wheat in the United States can be expected to
decrease by about 2 per cent. Hence the total receipts of farmers from
the sale of wheat will increase by about 8 per cent as the result of such
a policy.
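The arithmetic of this policy example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the "exact" variant assumes that the logarithmic demand function holds with a constant elasticity over the whole 10 per cent price change, which the text does not claim. The first-order approximation (adding the percentage changes in price and quantity) is the one used above.

```python
def quantity_change_pct(elasticity, price_change_pct, exact=False):
    """Percentage change in quantity demanded for a given percentage price change."""
    if exact:
        # Constant-elasticity demand: quantity is proportional to price ** elasticity.
        return 100.0 * ((1.0 + price_change_pct / 100.0) ** elasticity - 1.0)
    # First-order approximation: elasticity times the percentage price change.
    return elasticity * price_change_pct


def receipts_change_pct(elasticity, price_change_pct, exact=False):
    """Percentage change in total receipts (price times quantity)."""
    dq = quantity_change_pct(elasticity, price_change_pct, exact)
    if exact:
        return 100.0 * ((1.0 + price_change_pct / 100.0) * (1.0 + dq / 100.0) - 1.0)
    # First-order approximation: percentage changes are simply added.
    return price_change_pct + dq


ELASTICITY_WHEAT = -0.2143  # price elasticity taken from equation (1)

approx_q = quantity_change_pct(ELASTICITY_WHEAT, 10.0)
approx_r = receipts_change_pct(ELASTICITY_WHEAT, 10.0)
exact_q = quantity_change_pct(ELASTICITY_WHEAT, 10.0, exact=True)
exact_r = receipts_change_pct(ELASTICITY_WHEAT, 10.0, exact=True)

print(round(approx_q, 2), round(approx_r, 2))
print(round(exact_q, 2), round(exact_r, 2))
```

Both variants give a fall in consumption of roughly 2 per cent and a rise in receipts of a little under 8 per cent, in line with the rounded figures in the text.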

Equation (1) has been derived by the use of the classical method of least
squares. Hence there are a number of possible errors in the analysis.
(a) The identification problem (section 6.5 and Chapter 7) has not been
faced. (b) There is no consideration of the possible effects of multicollinearity
(section 6.5). (c) The interdependence of successive observations
has not been considered (section 10.5). It would also have been
better to consider the demand function within the framework of a system
of equations (section 3.7). 10 There is also no consideration of the mutual
interdependence of consecutive observations (Part 3). All these difficulties
make the validity of the results somewhat doubtful.

Example 2. To give an example of a more complex analysis, we use
Schultz' investigation of the interrelations of the demands for beef and
pork. 11 The data are as follows: total consumption of federally inspected
beef and veal in millions of pounds ($x_b$); composite retail price of beef
in cents per pound ($y_b$); composite retail price of pork in cents per pound
($y_p$); income, i.e., index of payrolls lagged three months, 1923-25 = 100
($I$). The data are annual figures from the period 1922-33.

The multiple regression equation is: 

(2)    $x_b = 3.4892 - 0.0899\,y_b + 0.0637\,y_p + 0.0187\,I$


10 M. A. Girshick and T. Haavelmo: “Statistical analysis of the demand for
food,” Econometrica, vol. 15 (1947), pp. 79 ff.

11 H. Schultz: The Theory and Measurement of Demand (Chicago, 1938), pp. 582 ff.





This is actually an equation which gives us the least squares estimate
of $x_b$ if $y_b$, $y_p$, and $I$ are known.

Schultz derived from this equation the following elasticities: price
elasticity of beef with respect to the price of beef, -0.49; price elasticity
of beef with respect to the price of pork (cross elasticity), +0.46; income
elasticity of the demand for beef, +0.36. These elasticities are derived
by using the averages over the whole period.
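For a linear regression the elasticity at the averages is the regression coefficient multiplied by the ratio of the mean of the explanatory variable to the mean of the dependent variable. The sample means are not reproduced here, so the following Python sketch (with hypothetical function names) only shows the formula and backs out the ratios of means that are implied by the reported coefficients and elasticities; these implied ratios are a by-product of the calculation, not figures from the source.

```python
def elasticity_at_means(coefficient, mean_regressor, mean_dependent):
    """Elasticity of a linear relationship evaluated at the sample means."""
    return coefficient * mean_regressor / mean_dependent


def implied_mean_ratio(elasticity, coefficient):
    """Ratio mean_regressor / mean_dependent implied by an elasticity and a coefficient."""
    return elasticity / coefficient


# Coefficients of the regression equation and the elasticities reported in the text.
ratio_beef_price = implied_mean_ratio(-0.49, -0.0899)
ratio_pork_price = implied_mean_ratio(0.46, 0.0637)
ratio_income = implied_mean_ratio(0.36, 0.0187)

# Round trip: the implied ratio reproduces the reported own-price elasticity.
check = elasticity_at_means(-0.0899, ratio_beef_price, 1.0)

print(round(ratio_beef_price, 2), round(ratio_pork_price, 2), round(ratio_income, 2))
print(round(check, 2))
```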

These results can be interpreted in the following way: If other things
remain equal and only the price of beef increases by 1 per cent, then the
demand for beef will decrease by almost 1/2 of 1 per cent. If other things
remain equal and the price of pork increases by 1 per cent, then the demand
for beef will increase by about 1/2 of 1 per cent. This shows that beef
and pork are substitutes in consumption; the cross elasticity measures
in a fashion the degree of substitutability. 12 If other things remain equal
and the payrolls increase by 1 per cent, then the demand for beef will
increase by more than 1/3 of 1 per cent.

As possible application to policy, let us consider two cases. Assume 
that the government deals with a situation in which the results of the 
analysis hold true. It contemplates the increase of pork prices by 10 per 
cent, for instance by government price fixing. Then it must face the fact 
that the demand for beef will also increase by about 4.6 per cent, and it 
must make allowance for the increased demand for beef. This may be 
very important in a comprehensive system of agricultural planning. 

In the second hypothetical case, assume that the government decides 
to increase the total earnings of the workers by 10 per cent, for instance 
by introducing minimum wages. Then it has to face the fact that the 
demand for beef will increase by approximately 3.6 per cent. The 
government has to make allowance for this in its planning of agricultural 
production. 

The validity of the results is subject to qualifications similar to those 
mentioned in connection with Example 1.

Among later work in the field of demand analysis we may mention 
the very interesting investigations of R. Stone into the demand for a 
number of English and American commodities. 13 He analyzes the demand 


12 J. R. Hicks: Value and Capital (2nd ed., Oxford, 1948), pp. 46 ff. J. L.
Mosak: General Equilibrium Theory in International Trade (Bloomington, Ind.,
1944), pp. 22 ff.

13 R. Stone: “The analysis of market demand,” Journal of the Royal Statistical
Society, vol. 108 (1945), pp. 1 ff. S. Malmquist: A Statistical Analysis
of the Demand for Liquor in Sweden (Uppsala, 1948). J. Tobin: “A statistical
demand function for food in the United States,” Journal of the Royal Statistical





for the following commodities: beer, spirits, tobacco, drink and tobacco, 
soap, telegrams in the United Kingdom; food, tobacco, household equipment,
automobiles in the United States. Stone is more aware of and
concerned with problems of identification (section 6.5 and Chapter 7) 
than was Schultz. He also tries to protect himself against the dangers of 
multicollinearity (section 6.5) by using bunch map analysis. 

We present his analysis for the consumption of beer in the United
Kingdom. He uses annual data, 1920-38. Let $q$ be the quantity of beer
consumed, $Q$ the aggregate real income, $p$ the average retail price of beer,
and $\pi$ the average retail level of all other commodities; $g$ is an index of
the strength of beer. Then the result of a careful analysis is the regression
equation representing the demand function for beer:


(3)    $q = 1.058\,Q^{0.136}\,p^{-0.727}\,\pi^{0.914}\,g^{0.816}$


Actually, a linear least squares regression analysis has been performed 
with the logarithms of the variables. But the result is here presented in 
non-logarithmic form. 
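The step from a linear regression in logarithms to a multiplicative demand function can be sketched as follows. The data in this Python example are synthetic, and the constant and exponents used to generate them are illustrative values patterned on the form of equation (3); the sketch does not reproduce Stone's data or his computations. The fitted slope coefficients of the logarithmic regression are the exponents (elasticities), and the antilogarithm of the fitted intercept is the multiplicative constant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic (hypothetical) series: income, own price, other prices, strength.
Q = rng.uniform(80.0, 120.0, n)
p = rng.uniform(0.8, 1.2, n)
pi = rng.uniform(0.8, 1.2, n)
g = rng.uniform(0.9, 1.1, n)

# Quantities generated from a multiplicative demand function with a small
# multiplicative disturbance; constant and exponents are illustrative only.
q = (1.058 * Q**0.136 * p**-0.727 * pi**0.914 * g**0.816
     * np.exp(rng.normal(0.0, 0.01, n)))

# Linear least squares on the logarithms of all variables.
X = np.column_stack([np.ones(n), np.log(Q), np.log(p), np.log(pi), np.log(g)])
coef, *_ = np.linalg.lstsq(X, np.log(q), rcond=None)

constant = np.exp(coef[0])   # back to the non-logarithmic form
exponents = coef[1:]         # elasticities with respect to Q, p, pi, g

print(round(float(constant), 3), np.round(exponents, 3))
```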

We note first that the income elasticity of the consumption of beer is
0.136. An increase in income by 1 per cent will bring about, other things
being equal, an increase in beer consumption of not quite 1/7 of 1 per cent.

The price elasticity of the demand for beer is negative. An increase
of 1 per cent in the price of beer will bring about a decrease in demand
of about 7/10 of 1 per cent, ceteris paribus. The positive exponent of $g$
indicates that consumers prefer strong beer to weak beer.

Let us again consider an application for policy. Assume a government
composed of teetotallers who want to discourage the consumption of beer.
They may try to achieve this goal by fixing the price of beer. They must
be aware of the fact that they have to raise the price of beer by almost
14 per cent in order to bring about a decrease in the consumption of beer
by 10 per cent, if other things remain equal. The increase in the price of
beer may also be brought about by specific taxation.
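The figure of almost 14 per cent follows from dividing the desired change in consumption by the price elasticity. The short Python sketch below (the function name is hypothetical) shows this first-order calculation for a price elasticity of -0.727 and, as a comparison only, the price increase that would be required if the constant-elasticity form held exactly over the whole range, an assumption that goes beyond the text.

```python
def required_price_increase_pct(elasticity, target_quantity_change_pct, exact=False):
    """Percentage price change needed for a target percentage change in quantity."""
    if exact:
        # Invert q proportional to p ** elasticity for the price ratio.
        ratio = (1.0 + target_quantity_change_pct / 100.0) ** (1.0 / elasticity)
        return 100.0 * (ratio - 1.0)
    # First-order approximation used in the text.
    return target_quantity_change_pct / elasticity


approx = required_price_increase_pct(-0.727, -10.0)
exact = required_price_increase_pct(-0.727, -10.0, exact=True)

print(round(approx, 1), round(exact, 1))
```

The exact figure is somewhat larger than the approximation because a 10 per cent reduction is not a small change.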

Example 4. The demand functions for non-durable consumers' goods, 
like the ones just exemplified, are relatively simple. But if we come to 
demand functions of producers' goods, we are at once dealing with the 
more complicated phenomenon of derived demand. 

The pioneer work in this very difficult field has been done by Whitman, 
who in a remarkable article published in Econometrica 14 dealt with the 


Society, series A, vol. 113 (1949), pp. 113 ff. R. Stone: The Role of Measurement
in Economics (Cambridge, 1951), pp. 71 ff.

14 R. H. Whitman: “The statistical law of demand for a producers' good,
as illustrated by the demand for steel,” Econometrica, vol. 4 (1936), pp. 138 ff.





demand for steel and considered various forms of this law of demand.
We will present as an example a dynamic demand function, which involves
the time derivative of the price.

Whitman used monthly data in the period 1921-30. We use the following
notation: $y$, index of sales of steel in millions of gross tons; $p$, price
of steel in cents per pound, corrected for trend; $dp/dt$, rate of change of
the price of steel over time (this is approximated by the first differences
of $p$); $I$, index of industrial production; $t$, time. The demand function
for steel, determined by the classical method of least squares, is:

(4)    $y = 1.49 - 1.27\,p + 6.27\,(dp/dt) + 4.64\,I - 0.03\,t$

All the regression coefficients except the last one are statistically significant.
This indicates that there was probably no very pronounced
upwards or downwards shift in the demand for steel over the period.
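The way in which the time derivative of the price enters such a regression can be sketched in Python. The series below are synthetic and hypothetical (they are not Whitman's data); sales are generated from the form of equation (4) plus a small disturbance, the derivative is approximated by first differences of the price, and the coefficients are then recovered by classical least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120                                    # ten years of monthly observations

# Synthetic (hypothetical) explanatory series.
t = np.arange(n, dtype=float)                                   # time
p = 2.0 + np.cumsum(rng.normal(0.0, 0.05, n))                   # price of steel
ind = 1.0 + 0.2 * np.sin(t / 6.0) + rng.normal(0.0, 0.02, n)    # production index

# First differences of p stand in for the time derivative dp/dt.
dp = np.diff(p)                            # dp[k] = p[k + 1] - p[k]
p1, ind1, t1 = p[1:], ind[1:], t[1:]       # drop the first month to align with dp

# Sales generated from the form of equation (4) plus a small disturbance.
y = (1.49 - 1.27 * p1 + 6.27 * dp + 4.64 * ind1 - 0.03 * t1
     + rng.normal(0.0, 0.01, n - 1))

# Least squares regression of y on a constant, p, dp/dt, the index, and time.
X = np.column_stack([np.ones(n - 1), p1, dp, ind1, t1])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(coef, 2))
```

One observation is lost in forming the first differences, so the regression uses one month fewer than the original series.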

It should be noted that the regression coefficient of $p$ (-1.27) is negative.
To a rising price of steel corresponds, ceteris paribus, a decline in demand
for steel. The coefficient of the time derivative of the price of steel, $dp/dt$,
is positive (+6.27) and much larger than the regression coefficient of $p$.
This indicates the highly speculative character of the demand for steel.
If the price of steel is rising, consumers of steel expect this rising tendency
to continue. Hence they buy more steel. If the price of steel falls, they
again expect this falling tendency to continue. Hence they buy less
steel. Theoretical models of this type have been analyzed by Evans
and Roos. 15

The coefficient of the index of industrial production is also positive.
If business improves, the demand for steel increases. All these results
are in agreement with the ideas held by theoretical economists regarding
the demand for a producers' good.

The speculative character of the demand for steel may be of importance 
in economic policy. The government must be aware, for instance, that,
ceteris paribus, a rising tendency in the price of steel is of great importance
if a rise in the demand for steel is desired. This may be brought about, 
for instance, by placing government orders for steel (e.g., for public works) 
in an appropriate way in order to achieve greater private consumption 
of steel. The success of such a policy depends of course upon the stability 
of the demand relationship. 


15 G. C. Evans: Mathematical Introduction to Economics (New York, 1930). 
C. F. Roos: Dynamic Economics (Bloomington, Ind., 1934), pp. 14 ff. H. T. 
Davis: The Theory of Econometrics (Bloomington, Ind., 1941), pp. 377 ff. 
See also G. Tintner: “The theoretical derivation of dynamic demand curves,”
Econometrica, vol. 6 (1938), pp. 375 ff.





There are again some objections to the statistical methodology used, 
similar to the ones mentioned in connection with Example 1. 

Example 5. Another interesting econometric problem in this field is 
the demand for durable consumers’ goods. Here we can refer to the 
work of C. F. Roos and V. von Szeliski, 16 which deals with the demand 
for passenger automobiles in the United States. 

Let $S$ be the replacement sales of automobiles; $I$, supernumerary
income, i.e., difference between national income and living cost; $P$,
average price per car; and $T$ an index for the scrapping of cars. The
formula for the demand for automobiles is then:

(5)    $S = 0.92\,I^{1.07}\,P^{-0.74}\,T^{1.10}$

This estimate has been derived by least squares regression. 

The elasticity of the demand for automobiles with respect to price is 
-0.74; the income elasticity is 1.07. The fact that the income elasticity
of the demand for automobiles is larger than the price elasticity is in
agreement with what we would expect for a commodity which is a semi-luxury
item, like automobiles.

To give an example of the use of such a demand function in economic 
policy, let us assume that the government succeeds in increasing the 
scrapping of cars by 10 per cent. This may, for instance, be accomplished 
by forbidding the use of the roads to cars above a certain age, a policy 
which has also other advantages because it lowers the probability of 
accidents. Then, other things being equal, the demand for automobiles 
will increase by about 11 per cent.

A critical evaluation of statistical demand functions has been given by 
G. Stigler. 17 

The statistical derivation of demand functions and related functions will
be illustrated in the following examples: section 3.5, Example 2; section
6.4, Example 1; section 6.5, Examples 3, 4, and 6; section 7.2, Example 1;
section 7.3, Example 1; section 10.3.7, Example 1; section 10.5, Example
1; section 11.1.2, Example 1; section 11.3, Examples 1 and 2.

3.2 Supply Functions 

It is interesting to note but not easy to explain that the derivation of 
supply functions has attracted much less interest among econometricians 
than the analysis of demand functions. A good example is the derivation 

16 C. F. Roos and V. von Szeliski: The Dynamics of Automobile Demand
(General Motors Corp., Detroit, 1939), pp. 21 ff. H. T. Davis: op. cit., pp. 397.

17 G. J. Stigler: “The limitations of statistical demand curves,” Journal of
the American Statistical Association, vol. 34 (1939), pp. 469 ff.




of the supply function for sugar by Henry Schultz. 1 This is accomplished 
together with the analysis of the demand function of sugar. 

Example 1. In deriving the law of supply for sugar for the world
market Schultz used link relatives. These are the ratios between two
consecutive prices or quantities. This device was used very often in older
demand and supply analysis.

We have the following data: Y denotes link relatives of the New York 
wholesale price of granulated sugar in cents per pound. These prices 
are percentage deviations from the trend. The trend is a cubic. X
denotes link relatives of the world production of sugar in thousands of
short tons; these are again the percentage deviations from a cubic trend. 
Schultz utilizes annual data. The period used is 1903-14. 

A lag correlation is performed (section 10.3.5); i.e., the price of each
year is correlated with the supply (quantity produced) of the following 
year. This lag of one year is justified by the consideration that the amount 
of sugar produced depends upon the price which prevailed at the time 
when production was started, i.e., the price about one year before the 
sugar is actually brought on the market. 

The best fit between the link relatives is obtained by simple least squares 
regression: 

(1)    $Y = 1.6788\,X - 0.751$

Using the averages of the data over the period considered, we can derive 
the average elasticity of supply: 0.57. This is to say, an increase of 1 per 
cent in the price will result in an increase of the production of sugar of 
about 0.57 per cent, if other things are equal. 

It is very unlikely that the relationship estimated by Schultz is still 
valid at the present period. But if it should still hold we could make the 
following statement: 

Suppose that a world planning authority or a world sugar trust wanted 
to increase the production of sugar by 10 per cent by fixing its price. 
Then, ceteris paribus, it would have to manipulate a price increase of 
about 17.5 per cent in order to achieve this result. 

Example 2. An example from the author's own work illustrates the 
so-called problem of identification and also the simultaneous utilization 
of demand and supply functions in economic policy. The problem of 
identification will be discussed in section 6.5 and Chapter 7. 

1 H. Schultz: Statistical Laws of Demand and Supply with Special Application
to Sugar (Chicago, 1928), pp. 162 ff. See also G. S. Shepherd: Agricultural
Price Analysis (3rd ed., Ames, Iowa, 1950), pp. 52 ff. D. G. Johnson: “The
nature of the supply function of agricultural products,” American Economic
Review, vol. 40 (1950), pp. 539 ff.






One problem which arises in connection with identification is the 
following: We have data for the prices paid and for the quantities sold 
and bought of a certain commodity on an isolated market. If we compute,
for instance, the regression of the quantity on the price, then we
cannot be sure whether we derive the demand function or the supply 
function of this commodity or a mixture of both. If there is a fixed 
period of production, then we may find the regression of the quantities 
with the contemporaneous prices and achieve an estimate of the demand 
relationship. And we may find the regression of the quantities with the 
prices which have been lagged by the length of the period of production 
and estimate the supply relationship, at least approximately (Example 1). 
This is the method used by Henry Schultz. 

In most cases, where both demand and supply functions fluctuate, we 
cannot achieve a fit of either the demand or the supply function without 
bringing other factors into our analysis. 

Let us again use the example of the demand and supply functions for
agricultural commodities in the United States. 2 We have the following
yearly data: $X_1$, prices received by farmers (index, 1910-14 = 100);
$X_2$, national income in billions of dollars; $X_3$, agricultural production
(index, 1935-39 = 100); $X_4$, time (origin between 1931 and 1932); $X_5$,
prices paid by farmers (index, 1910-14 = 100). The data are taken from
the period 1920-43. Annual figures were used.

If we use least squares methods to compute the linear regression of $X_3$
on $X_1$, or of $X_1$ on $X_3$, we obtain two equations. The first gives the
estimates of the quantity of agricultural products, if the price is known. 
The second is the estimate of the price for agricultural products, if the 
quantity is known (see Chapter 2). The two relationships cannot be 
identified as demand or supply functions. 

Making certain assumptions about the statistical nature of the data 
and using weighted regression methods which will be explained later 
(section 6.5), we can derive a demand function and a supply function for 
agricultural products in the United States. We make for the purpose 
of identification the following assumption, which is only approximately 
justified: The variable $X_2$, national income, enters into the demand
function for agricultural products but not into the supply function. The
variable $X_5$, prices paid by farmers, enters into the supply function of
agricultural products but not into the demand function. There is also 
presumably not a very close relationship between national income and 
prices paid by farmers. 

2 G. Tintner: “Multiple regression for systems of equations,” Econometrica,
vol. 14 (1946), pp. 5 ff.




It should perhaps be mentioned that these assumptions can to a certain
extent be checked by a statistical test which permits the approximate
estimation of the number of linear relationships existing between a given
set of variables. 3 This method will be discussed in section 6.5.

We achieve then on the basis of our assumptions the following demand 
and supply functions: 

(2)    $X_3 = 82.297 - 0.097\,X_1 + 0.424\,X_2 + 0.313\,X_4$

(3)    $X_3 = 374.097 + 1.721\,X_1 + 0.809\,X_4 - 3.611\,X_5$

The statistical fit is here not a simple least squares fit but is made by 
the method of weighted regression. It is assumed that there are errors 
in all the variables (except time) but no errors in the equations. Errors 
in the equations will be treated in Chapter 7. This is to say: We assume 
that all the relevant determining factors of demand and supply appear 
in our equations. This assumption is of course not strictly satisfied. 

Let us now assume that the American government levies a tax on
agricultural products or gives a subsidy which may be considered a negative
tax. This tax shifts the supply function upwards by the amount of
the tax $t$. A subsidy shifts it downwards by the amount $t$. 4 After the
imposition of a tax of amount $t$ or a subsidy of $-t$ per unit of agricultural
produce the new supply function is:

(4)    $X_3 = 374.097 - 1.721\,t + 1.721\,X_1 + 0.809\,X_4 - 3.611\,X_5$

Assume now that the following values of the variables other than price 
and quantity of the agricultural products are given: $X_2 = 72.375$, $X_4 = 0$,
$X_5 = 136.460$. These quantities correspond to the averages for the
period considered.

Substituting these quantities into our demand function (2) and into
the supply function after the tax or subsidy (4) we have:

(5)    $X_1 = 127.417 + 0.947\,t$

This equation (5) gives the equilibrium price after the imposition of the 
tax or subsidy. 

(6)    $X_3 = 100.625 - 0.092\,t$

This equation (6) gives the amounts sold and purchased after the 
imposition of the tax or subsidy. 

The two last equations (5) and (6) show that, if $t = 0$, i.e., if no tax


3 G. Tintner: “A note on rank, multicollinearity and multiple regression,”
Annals of Mathematical Statistics, vol. 16 (1945), pp. 304 ff.

4 K. E. Boulding: Economic Analysis (rev. ed., New York, 1948), pp. 727 ff.






is imposed and no subsidy given, then the price will be 127.417 and the 
amount exchanged on the market will be 100.625. These are the average 
values of the two variables $X_1$ and $X_3$ during the period of the analysis.

Assume now that a tax of t = 10 is imposed. This may, for instance, 
be achieved by imposing a sales tax of 10 for each unit of agricultural 
produce. If this measure does not otherwise influence demand and supply
conditions, the equilibrium price is now 136.887 and the equilibrium 
quantity is 99.705. 

On the other hand, suppose that the government pays a subsidy of
10 for each unit of agricultural produce ($t = -10$). Then the new
equilibrium price is 117.947, and the quantity of agricultural products
sold is now 101.545. We assume again that the subsidy does not otherwise
influence conditions on the market of agricultural commodities.
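The equilibrium values quoted above can be recomputed by solving the demand function and the supply function after the tax as two simultaneous linear equations in price and quantity. The following Python sketch uses the coefficients and average values given in the text; the function name and the arrangement as a matrix system are only one possible way to organize the calculation.

```python
import numpy as np

# Average values of the other variables over the period, as given in the text.
X2, X4, X5 = 72.375, 0.0, 136.460


def equilibrium(t):
    """Equilibrium price X1 and quantity X3 for a tax t (a subsidy if t < 0).

    Demand:  X3 + 0.097 * X1 = 82.297 + 0.424 * X2 + 0.313 * X4
    Supply:  X3 - 1.721 * X1 = 374.097 - 1.721 * t + 0.809 * X4 - 3.611 * X5
    """
    A = np.array([[0.097, 1.0],
                  [-1.721, 1.0]])
    b = np.array([82.297 + 0.424 * X2 + 0.313 * X4,
                  374.097 - 1.721 * t + 0.809 * X4 - 3.611 * X5])
    price, quantity = np.linalg.solve(A, b)
    return float(price), float(quantity)


for tax in (0.0, 10.0, -10.0):
    price, quantity = equilibrium(tax)
    print(tax, round(price, 2), round(quantity, 2))
```

The results agree with the figures in the text up to small differences in the second or third decimal, which arise because equations (5) and (6) use rounded coefficients.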

We have of course made the assumption throughout these examples 
that other things remain equal. Tastes and technology are unchanged. 
The national income and prices paid by farmers are assumed not to be 
influenced by the tax or subsidy on agricultural products. In fact they 
are assumed to have their average values for the period analyzed. This 
may of course not be true. But then we have to analyze demand and 
supply of agricultural products within the framework of general equilibrium
and not consider them only on an isolated market. Problems of
this type will be treated in section 3.7. 

There are also some difficulties with the statistical methods involved, 
which will be discussed in connection with Example 3 of section 6.5, where
a more detailed account of the methodology will be given. 

The statistical derivation of supply functions and related functions will 
be illustrated in the following examples in the later chapters of this book: 
section 6.5, Examples l, 3, 4, and 6; section 7.2, Example 1; section 
10.3.5, Examples 1 and 2; section 10.3.7, Example 1; section 11.1.2, 
Example 1. 

3.3 Cost Functions 

The fitting of empirical cost functions has been the particular concern 
of Professor Joel Dean. He has analyzed the cost of leather belts, 1 the 
cost of hosiery, 2 and the cost of a department store. 3 

1 J. Dean: “The relation of cost to output for a leather belt shop,” Technical 
Paper 2, National Bureau of Economic Research (New York, 1941). 

2 J. Dean: “Statistical cost functions of a hosiery mill,” Studies in Business
Administration, School of Business, University of Chicago, vol. 11, no. 4
(Chicago, 1941).

3 J. Dean: “Department-store cost functions,” in O. Lange et al., ed.:




Example 1. Dean derives cost functions for a hosiery knitting mill 4 
using monthly data for the 5 years 1935-39. Denoting the combined 
cost in dollars by \(X_1\) and the output in dozens of pairs by \(X_2\), he estimates
the total cost function by simple least squares regression:

\[ X_1 = 2935.59 + 1.998\,X_2 \tag{1} \]

It is somewhat doubtful whether the application of the classical method
of least squares can give us valid estimates in this case. The fact that
the cost function is only one of a system of simultaneous relationships
has been neglected (section 6.5 and Chapter 7). There is also no consideration
of errors in the variables (section 6.5) and the possible lack of
independence of consecutive observations (Part 3).

Statistical tests show that the regression coefficient is statistically significant.
They also reveal that the assumption of linearity for total cost
is justified, within the range of the given data. Dean also fitted quadratic
and cubic cost curves and found that these equations did not compare
favorably with the linear relationship given above.

The marginal cost of producing hosiery is constant, and its best estimate 
is 1.998. This is (approximately) the additional cost accruing if one more 
small quantity (i.e., one more dozen pairs) is produced. 

Linear total cost or constant marginal cost functions have also been 
established by Dean for the other cost functions which he has estimated. 
This fact seems to be in contradiction to the form of the short-run cost 
function which is generally assumed by theoretical economists. 

The economist assumes 5 in general that the total cost functions have a
certain amount of curvature and are not straight lines. The short-run
marginal and average cost functions derived from the theoretical total
cost curve are assumed to be first decreasing and then increasing after
reaching a minimum. It is very interesting that the investigations of Dean
do not seem to verify this a priori notion of the economists about short-run
costs.

The standard error of the regression coefficient (1.998) is 0.034. The 
number of degrees of freedom is 58. Hence, we can here use the normal 


Studies in Mathematical Economics and Econometrics, in Memory of Henry
Schultz (Chicago, 1942), pp. 222 ff.

4 J. Dean: “Statistical cost functions of a hosiery mill,” Studies in Business
Administration, School of Business, University of Chicago, vol. 11, no. 4
(Chicago, 1941), pp. 19 ff.

5 J. Robinson: The Economics of Imperfect Competition (London, 1933).
J. Viner: “Cost,” in Encyclopaedia of the Social Sciences, vol. 3 (New York,
1937), pp. 466 ff.






approximation to the t-distribution, assuming the errors or deviations to
be normally distributed and mutually independent. At the 5 per cent
level of significance we have t = 1.96.

The fiducial or confidence limits for the marginal cost are at the 95 per 
cent level 1.931 and 2.064. They are close together because the standard 
error is small. 
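
The arithmetic behind these limits can be retraced in a few lines. This is a sketch under the normal approximation described above, using only the coefficient and standard error quoted in the text; the variable names are illustrative.

```python
from statistics import NormalDist

coefficient = 1.998      # estimated marginal cost, dollars per dozen pairs
standard_error = 0.034   # standard error of the regression coefficient

# Two-sided 5 per cent point of the normal distribution (about 1.96).
z = NormalDist().inv_cdf(0.975)

t_ratio = coefficient / standard_error
lower = coefficient - z * standard_error
upper = coefficient + z * standard_error

print(f"ratio of coefficient to standard error: {t_ratio:.1f}")
print(f"95 per cent limits: {lower:.4f} to {upper:.4f}")
```

The ratio of the coefficient to its standard error is far above 1.96, which is why the coefficient is judged significant. The computed upper limit is about 2.0646, which the text reports as 2.064.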

These results might be important for a government if it decided to
place large orders of hosiery, e.g., for the army. Then, if the results of
the study are valid, each dozen pairs of hosiery would add about $2.00
to the cost of a firm which was already operating. This could, for instance,
be compared with the cost which would arise if the government decided
to produce the hosiery itself.

Example 2. Another interesting example of fitting cost functions is the
investigation by T. O. Yntema into the cost of steel. 6 He bases his study
upon data taken from the records of the U.S. Steel Corporation, 1927-38.
Annual figures are used.

Let Q be the total cost of making steel, measured in millions of dollars,
that is, operating cost plus idle plant expense plus bond interest minus
purchase discounts minus intercompany transactions. Let u be the
amount of steel, measured in millions of tons of steel shipped. Then the
total cost of making steel is estimated by the relationship:

\[ Q = 182.1 + 55.734\,u \tag{2} \]

This equation is derived by the classical method of least squares. It is 
again doubtful whether this method can yield valid estimates. 

It is remarkable that (in the relevant interval covered by the data) the 
total cost of making steel seems to be a linear function of the amount 
produced. Hence the marginal cost is constant. Within the range of 
the data, the marginal cost of making steel appears to be 55.734. That 
is to say, it costs, other things being equal, about 55.734 million dollars 
to produce an additional million tons of steel. 
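
A short sketch may help separate marginal from average cost under the fitted relationship (2). The output levels below are illustrative values chosen for the example, not Yntema's observations, and the function names are hypothetical.

```python
INTERCEPT = 182.1   # millions of dollars, constant term of equation (2)
SLOPE = 55.734      # millions of dollars per million tons shipped


def total_cost(u):
    """Total cost Q in millions of dollars for u million tons shipped."""
    return INTERCEPT + SLOPE * u


def average_cost(u):
    """Cost per ton in dollars (millions of dollars per million tons)."""
    return total_cost(u) / u


for u in (5.0, 10.0, 15.0):   # illustrative outputs only
    extra = total_cost(u + 1.0) - total_cost(u)
    print(u, round(total_cost(u), 1), round(average_cost(u), 2), round(extra, 3))
```

With a linear total cost function the cost of an additional million tons is the same at every output, while average cost falls as the constant term is spread over more tons.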

The fact of constant short-run marginal cost, discovered by all
investigators of statistical cost functions, is important because it contradicts
the a priori assumptions of the economists. There are a variety of possible
explanations, all of them more or less plausible.

6 T. O. Yntema: “Steel prices, volume and cost,” in United States Steel
Corporation, T.N.E.C. Papers, vol. I (New York, 1940), pp. 231 ff. See also
K. H. Wylie and M. Ezekiel: “The cost curve for steel production,” Journal
of Political Economy, vol. 48 (1940), pp. 777 ff.; “Cost functions for the steel
industry,” Journal of the American Statistical Association, vol. 36 (1941), pp.
91 ff. J. A. Nordin: “A note on a light plant's cost curve,” Econometrica,
vol. 15 (1947), pp. 231 ff.





(a) The range of the data is not great enough to cover the sections of 
the cost curve where increasing or decreasing marginal costs appear. 
The data cover only the middle section of the “real” cost curve around 
the point of inflection. This part of the total cost function may easily 
be approximated by a straight line. In other words, the marginal cost 
curve has a very flat minimum, which may be approximated by a line 
parallel to the quantity axis within the range covered by the data. 

(b) The assumptions of the economists are wrong, and we have actually
in the economy constant marginal cost, at least over the relevant section
of the cost curve. This would correspond to the assumption that within
the relevant range of the data the factors of production are combined in
more or less constant proportions. 7 This again would lead to the conclusion
that there are (within a wide range) no discernible economies or
dis-economies of large-scale production. 8 The fact that we can observe
in many industries enterprises of various sizes which all seem to survive
together would point in the same direction. This situation prevails, for
instance, in the American steel industry.

(c) The observed empirical cost curves are actually cost curves of
enterprises functioning in a dynamic economy which is subject to the
ups and downs of business fluctuations. The cost curves contemplated
by the theoretical economists are static and hence not relevant in this
situation. It becomes profitable and indeed necessary for these enterprises
to be organized with due regard to flexibility and adaptability.
This fact, pointed out by Stigler, 9 Hart, 10 and others, 11 would lead to total
cost curves which are linear or nearly linear in the middle section, while
they may have strong curvature elsewhere. The average and marginal

7 E. Chamberlin: The Theory of Monopolistic Competition (6th ed., Cambridge,
Mass., 1948), pp. 110 ff. G. J. Stigler: Production and Distribution Theories
(New York, 1941), pp. 320 ff. P. A. Samuelson: “Abstract of a theorem concerning
substitutability in open Leontief systems,” in T. C. Koopmans, ed.:
Activity Analysis of Production and Allocation (New York, 1951), pp. 142 ff.

8 G. J. Stigler: op. cit. K. Menger: “Bemerkungen zu den Ertragsgesetzen,”
Zeitschrift für Nationaloekonomie, vol. 7 (1936), pp. 25 ff.; “Weitere Bemerkungen
zu den Ertragsgesetzen,” ibid., pp. 388 ff.

9 G. J. Stigler: “Production and distribution in the short run,” Journal of
Political Economy, vol. 47 (1939), pp. 308 ff.

10 A. G. Hart: “Imputation and the demand for productive resources in
disequilibrium,” Explorations in Economics (New York, 1936), pp. 268 ff.
“Anticipations, uncertainty and dynamic planning,” Studies in Business
Administration, vol. 11, no. 1 (Chicago, 1940), pp. 25 ff.

11 G. Tintner: “The theory of production under nonstatic conditions,” 
Journal of Political Economy , vol. 50 (1942), pp. 645 ff. 





cost curves have very flat minima, since the enterprise is not constructed
to function most efficiently at a single given output but to be reasonably
efficient for a wide range of possible outputs. Average and marginal
costs rise sharply if less or more than the range of normal output is
produced, which extends from very low production in a mild depression
to reasonably high production in a mild boom. Experience with American
production conditions in the great depression 12 and the appearance
of bottlenecks 13 during the second World War would tend to confirm this
last explanation. A critical evaluation of statistical cost functions has
been given by H. Staehle. 14
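
Explanation (a) can be made concrete with a small numerical sketch. The cubic cost curve below is entirely hypothetical (it is not fitted to any of the data discussed here); it is chosen so that its point of inflection lies at an output of 10, and a straight line is then fitted by least squares to outputs between 8 and 12 only.

```python
def total_cost(q):
    # Hypothetical cubic cost curve with its point of inflection at q = 10.
    return 100 + 10 * q - 0.3 * q**2 + 0.01 * q**3


def fit_line(xs, ys):
    """Simple least squares; returns (intercept, slope, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return my - slope * mx, slope, sxy**2 / (sxx * syy)


outputs = [8 + 0.25 * i for i in range(17)]   # observed range: 8 to 12
costs = [total_cost(q) for q in outputs]
intercept, slope, r_squared = fit_line(outputs, costs)
print(round(slope, 3), round(r_squared, 5))
```

In this constructed case the fitted slope is close to the true marginal cost at the inflection point (7 in these units) and the straight line fits almost perfectly, although the underlying curve is cubic. A narrow range of observed outputs can therefore hide the curvature assumed by theory.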


3.4 Production Functions 

This is a field of paramount importance in econometrics. Production 
functions, which embody the technical and technological conditions of 
production, are in a sense more basic than cost functions. If factor prices 
are given, then cost functions can be derived from production functions. 
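
As a sketch of such a derivation, assume the two-factor form with exponents summing to 1 that is fitted in Example 1 of this section, and assume given factor prices w for labor and r for capital. The symbols A, α, w, r, and K(P) for total cost are introduced here for illustration and do not appear in the original. Minimizing outlay for a given product P gives:

```latex
\begin{align*}
&\min_{L,\,C}\; wL + rC
  \quad\text{subject to}\quad P = A\,L^{\alpha}C^{1-\alpha}, \\[4pt]
&\text{first-order conditions:}\quad
  \frac{wL}{\alpha} = \frac{rC}{1-\alpha}, \\[4pt]
&K(P) = \frac{P}{A}
  \left(\frac{w}{\alpha}\right)^{\alpha}
  \left(\frac{r}{1-\alpha}\right)^{1-\alpha}.
\end{align*}
```

Under these assumptions total cost is proportional to the product, so marginal cost is constant, which is the pattern reported for the statistical cost functions of section 3.3.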

A new kind of production function has been introduced into the
discussion recently under the title, linear programming. 1 These methods
are similar to Leontief's ideas discussed in section 3.6.

Production functions have been very well explored and widely discussed,
thanks mainly to the efforts of one of the great pioneers of econometrics,
Professor Paul H. Douglas of the University of Chicago (now a
senator of the United States). Starting with an article (together with
C. W. Cobb) in 1928, 2 he pursued the subject in his monumental Theory
of Wages. 3 Together with his associates he has derived production
functions for the following regions and periods: United States, 1899-1922; 4
Massachusetts, 1890-1926; 5 New South Wales, 1901-27; 6 Victoria,


12 L. C. Robbins: The Great Depression (New York, 1934).

13 O. Lange: Price Flexibility and Employment (Bloomington, Ind., 1944),
pp. 2 ff.

14 H. Staehle: “Statistical cost functions: appraisal of recent contributions,”
American Economic Review, vol. 32 (1942), pp. 321 ff.

1 M. K. Wood and G. B. Dantzig: “Programming of interdependent activities.
I. General discussion,” Econometrica, vol. 17 (1949), pp. 193 ff.; “II. Mathematical
model,” ibid., pp. 200 ff. T. C. Koopmans, ed.: Activity Analysis of
Production and Allocation (New York, 1951).

2 P. H. Douglas and C. W. Cobb: “A theory of production,” American
Economic Review, vol. 18 (1928), supplement, pp. 139 ff.

3 P. H. Douglas: Theory of Wages (New York, 1934).

4 P. H. Douglas and C. W. Cobb: op. cit.

5 P. H. Douglas: op. cit., pp. 113 ff., 159 ff.

6 Ibid., pp. 167 ff.





1907-29. 7 These studies are based upon time series. Cross-section 
data (i.e., data for various industries) have been used in the following 
studies: United States, 1909; 8 Victoria, 1910-11, 1923-24, 1927-28; 
Australia, 1934-34; New South Wales, 1933-34; 9 United States, 1919; 10
Australia, 1912, 1922-23, 1926-27, 1936-37; 11 United States, 1914, 12 
1904; 13 Canada, 1923, 1927, 1935, 1937. 14 

Example 1. We will illustrate the efforts of Paul Douglas by presenting 
one of his earlier published analyses in the field of production functions, 15 
a production function for manufacturing in the United States. 

The data are for the United States, for the years 1900-22. All the 
series utilized are annual. Let P be production (index of physical volume 
of manufacturing); L, labor (index of the average number of wage earners 
in manufacturing); and C, capital (index of the fixed capital in manufac¬ 
turing). All are expressed as indices, 1899 = 100. 

Douglas fits a function which is linear in the logarithms. He uses 
classical least squares regression. This statistical procedure assumes that 
there are no errors of observations (section 6.5). The identification 
problem (Chapter 7) is also neglected. He assumes that the sum of the 
two regression coefficients (exponents) is 1. In non-logarithmic form 
the production function appears as follows: 

\[ P = 1.01\,L^{0.75}\,C^{0.25} \tag{1} \]

7 P. H. Douglas and M. L. Handsaker: “The theory of marginal productivity
tested by data for manufacturing in Victoria,” Quarterly Journal of Economics,
vol. 52 (1937), pp. 1 ff., 215 ff.

8 P. H. Douglas and M. Bronfenbrenner: “Cross section studies in the
Cobb-Douglas function,” Journal of Political Economy, vol. 47 (1939), pp.
761 ff.

9 P. H. Douglas and G. Gunn: “Further measurements of marginal produc¬ 
tivity,” Quarterly Journal of Economics , vol. 54 (1940), pp. 399 ff. 

10 P. H. Douglas and G. Gunn: “The production function for American 
manufacturing in 1919,” American Economic Review , vol. 31 (1941), pp. 67 ff. 

11 P. H. Douglas and G. Gunn: “The production function for Australian 
manufacturing,” Quarterly Journal of Economics , vol. 56 (1941), pp. 108 ff. 

12 P. H. Douglas and G. Gunn: “The production function for American 
manufacturing in 1914,” Journal of Political Economy , vol. 50 (1942), pp. 595 ff. 

13 P. H. Douglas, P. Daly, and E. Olson: “The production function for 
manufacturing in the United States in 1904,” Journal of Political Economy , 
vol. 51 (1943), pp. 61 ff. 

14 P. H. Douglas and P. Daly: “The production function for Canadian 
manufacture,” Journal of the American Statistical Association , vol. 39 (1943), 
pp. 178 ff. 

15 P. H. Douglas and C. W. Cobb: op. cit. See also H. T. Davis: Theory
of Econometrics (Bloomington, Ind., 1941), pp. 159 ff.





The exponents are elasticities. 0.75 is the elasticity of production with
respect to labor. Equation (1) indicates that the increase of labor by
1 per cent will, other things being equal, bring about an increase of (about)
3/4 of 1 per cent in the product. Similarly, if capital increases by 1 per
cent, then, ceteris paribus, the total product will increase by (about)
1/4 of 1 per cent.

It is here assumed that the sum of the exponents is 1. In the later
studies cited above this assumption is not made, but still the sum of the
exponents remains very close to 1. The empirical production function
appears as a linear homogeneous function of the factors of production. 16
If the amounts of all factors of production are doubled, then the product
is also doubled, etc.
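
Both properties, the elasticities and the linear homogeneity, can be checked numerically. The sketch below evaluates the fitted function with the coefficient 1.01 and the exponents 0.75 and 0.25 stated in the text; the index values of 100 are chosen only for illustration.

```python
def production(labor, capital):
    """Fitted function of Example 1: P = 1.01 * L**0.75 * C**0.25."""
    return 1.01 * labor**0.75 * capital**0.25


base = production(100.0, 100.0)
more_labor = production(101.0, 100.0)     # labor up 1 per cent
more_capital = production(100.0, 101.0)   # capital up 1 per cent
doubled = production(200.0, 200.0)        # both factors doubled

labor_effect = 100 * (more_labor / base - 1)      # per cent change in product
capital_effect = 100 * (more_capital / base - 1)  # per cent change in product

print(round(labor_effect, 3), round(capital_effect, 3), round(doubled / base, 6))
```

The percentage effects come out at roughly 3/4 and 1/4 of 1 per cent, and doubling both factors doubles the product, as the text states.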


This would indicate that in all the countries investigated, in the great
variety of periods analyzed, there are no discernible economies or
dis-economies of large-scale production. If these studies have some reliability,
we obtain the somewhat paradoxical result that, on the whole,
small and large enterprises are approximately equally profitable, within
the (wide) range of the data used.

Example 2. Douglas' conclusion, that the production function is a
linear homogeneous function of the amounts of the factors used, at least
in a certain range, is to a certain extent substantiated by an investigation
of the author, in which a production function is derived from 609 actual
business records of Iowa farms in 1942. 17

The variables are X, the total product, measured by gross profits;
A, land, number of acres used by the farm; B, labor, total number of
months of labor, both hired and family labor of the operator; C, farm
improvements, i.e., fences, etc.; D, liquid assets: cattle, poultry, hogs,
sheep, feed, seeds, and supplies; E, working assets: breeding cattle,


16 E. Schneider: Theorie der Produktion (Vienna, 1934). R. G. D. Allen:
Mathematical Analysis for Economists (London, 1938), pp. 502 ff. S. Carlson:
A Study in the Pure Theory of Production (London, 1939). P. H. Wicksteed:
The Coordination of the Laws of Distribution, reprint (London, 1932).

17 G. Tintner: “A note on the derivation of production functions from farm
records,” Econometrica, vol. 12 (1944), pp. 26 ff. See also C. G. Hildreth:
A Study of Production Functions from Farm Record Data (unpublished thesis,
Ames, Iowa, 1947). W. H. Nichols: Labor Productivity Functions in Meat
Packing (Chicago, 1948). M. J. J. Verhulst: “The pure theory of production
applied to the French gas industry,” Econometrica, vol. 16 (1948), pp. 295 ff.
E. O. Heady: “Production functions from a random sample of farms,” Journal
of Farm Economics, vol. 28 (1946), pp. 989 ff.






horses, tractors, crop machinery, livestock equipment, special machinery,
trucks, farm share of the automobile. All variables except A and B are
measured in dollars.

A Douglas-type production function, i.e., a function which is linear in 
the logarithms, was fitted to the data. The reasons for this choice are 
as follows: (1) it yields elasticities immediately, as indicated above; (2) it 
permits the phenomenon of decreasing returns to come into evidence 
with the use of the least complicated function. This would not be the 
case if we, for instance, should choose a function which is linear in the 
original data. The method of fitting is the classical method of least 
squares. There are again some objections to the use of this method. 
The elasticities of the various factors of production are indicated in
Table 1. These are simply the regression coefficients of log X on log A,
log B, etc.

TABLE 1

Factor of production             Elasticity
Land, A                             0.288
Labor, B                            0.158
Improvements, C                     0.054
Liquid assets, D                    0.212
Working assets, E                  -0.005
Cash operating expenses, F          0.159


All the regression coefficients except the one for working assets, E
(-0.005), are significant at the 5 per cent level of significance. Hence
we conclude that the negative value of this elasticity could have arisen 
by chance. Working assets have apparently not exerted any appreciable 
effect upon the productivity of the Iowa farms analyzed. 

All the elasticities are smaller than 1. Hence all the factors in question 
show the phenomenon of decreasing marginal returns, 18 as we should 
expect from theoretical considerations. 

The sum of the elasticities is 0.866. The question arises whether this 
is near enough to 1, so that we may say we have probably constant returns 
to scale. This assertion can be tested statistically in the following way: 

We fit another production function under the assumption that the sum 
of the elasticities (regression coefficients) is 1. Then we compare the 
sums of squares of the residuals for both fits. These are as follows: 
for the fit without the assumption of a linear homogeneous production 
function, 7.973; and for the fit with the assumption indicated above, 
8.044. The difference of the two sums of squares is only 0.071, and can 
be tested by an analysis of variance test. The test reveals that the 


18 G. J. Stigler: Production and Distribution Theories (New York, 1949), pp.
61 ff. E. Schneider: Einfuehrung in die Wirtschaftstheorie, vol. 2 (Tuebingen,
1949), pp. 89 ff. H. von Stackelberg: Grundlagen der theoretischen
Volkswirtschaftslehre (Bern, 1948), pp. 33 ff.





difference is not significant at the 1 per cent level of significance. A 
discussion of the test is given in section 5.3. 
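
The comparison of the two fits can be retraced as follows. This is a sketch, not the computation of section 5.3: the residual degrees of freedom are an assumption (609 farms less six elasticities and a constant term), and the names are illustrative.

```python
from statistics import NormalDist

# Elasticities from Table 1.
elasticities = {"A": 0.288, "B": 0.158, "C": 0.054,
                "D": 0.212, "E": -0.005, "F": 0.159}
total = sum(elasticities.values())

rss_unrestricted = 7.973   # fit without the restriction
rss_restricted = 8.044     # fit with the elasticities summing to 1
n_farms = 609
n_parameters = 7           # assumption: six elasticities plus a constant term

df_residual = n_farms - n_parameters
# One restriction, so the numerator has one degree of freedom.
f_ratio = (rss_restricted - rss_unrestricted) / (rss_unrestricted / df_residual)

# The 1 per cent point of F(1, m) decreases toward the squared normal
# deviate (about 6.63) as m grows, so an F ratio below that bound cannot
# be significant at the 1 per cent level for any m.
lower_bound_1pct = NormalDist().inv_cdf(0.995) ** 2

print(round(total, 3), round(f_ratio, 2), f_ratio < lower_bound_1pct)
```

Under the stated assumption about degrees of freedom the ratio is a little above 5, below the 1 per cent point, which agrees with the conclusion in the text.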

Hence, we may conclude: It is not unlikely that the production function 
of Iowa farms is a homogeneous function of degree 1 in the factors of
production and shows no economies or dis-economies of large-scale 
production. This would to a certain extent tend to confirm the previous 
results of Paul Douglas and his collaborators. It should, however, be 
noted that our data are taken from the more prosperous and advanced 
Iowa farms and may hence not be typical for production conditions on 
Iowa farms as a whole. 

These results, if actually true, or approximately true, would of course 
be of great importance for economic theory and policy. The fact that 
we have constant returns to scale is also to a certain extent corroborated 
by the fact that we observe in many industries the co-existence and survival 
of firms of various size. This condition prevails also with farms in Iowa. 

A theoretical explanation of this phenomenon can perhaps be given 
along the lines of Kaldor’s 19 theory of the firm. There is one factor of 
production, which (by necessity) has not been included in the empirical 
production functions: This is entrepreneurship. It is likely, as Kaldor 
suggests, that it is this scarce factor of production which eventually 
causes decreasing returns to scale and actually may determine the optimum 
size of the enterprise. 

Example 3. The author may perhaps be forgiven if he illustrates the 
derivation of production functions further by one of his efforts in this 
direction. It has been remarked by critics that Douglas' production 
functions are based upon global data of capital, labor, and product taken 
from time series or cross sections of industry. Some critics have, perhaps 
not without reason, doubted the economic significance of such macro-economic
concepts. The aggregation problem 20 is certainly involved here.


19 N. Kaldor: “The equilibrium of the firm,” Economic Journal, vol. 44
(1934), pp. 60 ff.

20 F. Divisia: Economique rationnelle (Paris, 1928), pp. 260 ff. L. R. Klein:
“Macro-economics and the theory of rational behavior,” Econometrica, vol. 14
(1946), pp. 95 ff. K. May: “The aggregation problem in a one-industry
model,” ibid., pp. 285 ff. S. S. Pou: “A note on macro-economics,” ibid., pp.
299 ff. L. R. Klein: “Remarks on the theory of aggregation,” ibid., pp. 303 ff.
K. May: “Technological change and aggregation,” ibid., vol. 15 (1947), pp.
51 ff. W. W. Leontief: “Introduction to a theory of the internal structure of
functional relationships,” ibid., pp. 361 ff. A. Nataf: “Sur la possibilité de la
construction de certaines macromodèles,” ibid., vol. 16 (1948), pp. 330 ff.
A. Nataf and R. Roy: “Remarques et suggestions relatives aux nombres indices,”
ibid., pp. 330 ff. B. D. Mudgett: Index Numbers (New York, 1951).





Hence it may be interesting to investigate similar results for a single 
industry, results which are based not upon global data but upon the 
detailed accounts of 468 Iowa farmers in 1939. 21 These farmers form, 
unfortunately, not a random sample. They are not typical of average 
Iowa farmers but represent rather the more efficient and successful farmers. 

The product is represented by the net profits of the farm. The factors 
of production are land in acres and labor in total number of months. 
The other capital assets are measured in dollars: farm improvements 
(buildings, fences, etc.), liquid assets (livestock, feed, seed, fertilizer, etc.), 
working assets (farm machinery, farm share of automobile, breeding 
stock, equipment, etc.), cash operating expenses (equipment repairs, fuel, 
oil, feed purchased, etc.). The classification of the capital items is admittedly
not ideal but is dictated largely by the bookkeeping habits of the
farmers.

A Douglas-type production function was fitted by the classical method 
of least squares. This is a function which is linear in the logarithms. 
The sum of the exponents turned out to be 0.9871. This is again very
near to 1, indicating the existence of constant or almost constant returns
to scale.

The marginal productivities which have been computed for one additional
dollar spent on the factor, and for the average of the farms, are
as follows: land, 0.0981; labor, 1.0519; improvements, 0.0449; liquid
assets, 0.1790; working assets, 0.1585; cash operating expenses, 0.2808.
All the marginal productivities, except the one for improvements, are
significant at the 5 per cent level of significance. These tests are valid
only under the assumptions of normality and independence of the observations.

These results are interesting for the economist and suggestive for agricultural
policy. It has, however, to be borne in mind that they are not
typical for Iowa farms, but are characteristic only of the more advanced 
and prosperous Iowa farmers. 

The marginal productivities may be interpreted in the following way: 

If conditions on Iowa farms are the same as on the average of the farms 
analyzed in 1939, then, all other things being equal, the addition of one 
dollar's worth of a specific factor will increase the product by the following 
amount in dollars: land, about 0.10; labor, about 1.05; improvements, 
about 0.04; liquid assets, about 0.18; working assets, about 0.16; cash 
operating expenses, about 0.28. 

This indicates that, if the results are somewhat reliable and typical, the 

21 G. Tintner and O. H. Brownlee: “Production functions derived from farm
records,” Journal of Farm Economics, vol. 26 (1944), pp. 566 ff. “A correction,”
ibid., vol. 35 (1953), p. 123.





greatest increase in agricultural productivity can be attained on Iowa
farms by increasing labor and cash operating expenses. Hence an agricultural
policy which aims at maximum productivity would have to
facilitate the greater use of labor and cash operating expenses, i.e., equipment
repairs, fuel, oil, feed purchased, etc. This could be done, for
instance, by subsidizing the use of these items by the farmers.
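
The ranking behind this policy conclusion can be read off directly from the reported marginal productivities. The following sketch simply sorts the figures quoted in the text; the dictionary keys are labels chosen for the example.

```python
# Marginal productivities per additional dollar, as reported in the text
# for the average of the 468 Iowa farms analyzed in 1939.
marginal_productivity = {
    "land": 0.0981,
    "labor": 1.0519,
    "improvements": 0.0449,
    "liquid assets": 0.1790,
    "working assets": 0.1585,
    "cash operating expenses": 0.2808,
}

# Sort from the largest to the smallest return per additional dollar.
ranked = sorted(marginal_productivity.items(), key=lambda kv: kv[1], reverse=True)
for factor, value in ranked:
    print(f"{factor:<24} {value:.2f}")
```

Labor and cash operating expenses head the list, and only labor returns more than the additional dollar spent on it.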

There is an extensive critical literature on the statistical derivation of
production functions. 22

The statistical derivation of production functions and similar relationships
will be further illustrated below by the following examples: section
5.3, Example 1; section 6.4, Example 2; section 6.5, Examples 2 and 5;
section 11.1.1, Example 1.

3.5 Utility Functions and Engel Curves 

Unfortunately not very much work has been done in the field of determining
empirically utility functions or indifference systems. 1 These
functions are more basic, in a certain sense, than demand functions. If
prices and incomes are known, then demand functions can be derived
from utility functions or indifference systems.

A. Wald 2 developed a very interesting method which would under
certain circumstances enable us to find (approximately) the indifference
systems from a knowledge of the Engel curves, i.e., the relation between
income and expenditure on various commodities. A critical evaluation
of the problems connected with the empirical derivations of indifference
functions has been given by Wallis and Friedman. 3

22 M. Bronfenbrenner: “Production functions,” Econometrica, vol. 12 (1944),
pp. 35 ff. D. Durand: “Some thoughts on marginal productivity, with special
reference to Professor Douglas' analysis,” Journal of Political Economy, vol.
45 (1937), pp. 740 ff. J. Marschak and W. H. Andrews: “Random simultaneous
equations and the theory of production,” Econometrica, vol. 12 (1944),
pp. 143 ff. H. Mendershausen: “On the significance of Professor Douglas'
production function,” ibid., vol. 6 (1938), pp. 143 ff. M. W. Reder: “An
alternative interpretation of the Cobb-Douglas function,” ibid., vol. 11 (1943),
pp. 259 ff. V. E. Smith: “Non-linearity in the relation between input and
output,” ibid., vol. 13 (1945), pp. 260 ff.

1 L. L. Thurstone: “The indifference function,” Journal for Social Psychology,
vol. 2 (1931), pp. 139 ff.

2 A. Wald: “The approximate determination of indifference systems by
means of Engel curves,” Econometrica, vol. 8 (1940), pp. 144 ff.

3 W. A. Wallis and M. Friedman: “The empirical derivation of indifference
functions,” Studies in Mathematical Economics and Econometrics (Chicago,
1942), pp. 175 ff.





We refer here to the fundamental work of Fisher, 4 Frisch, 5 and Waugh, 6
who give empirical studies of utility functions.

Example 1. We will give an account of Waugh’s analysis, 7 which is
based upon the ideas of Frisch. 8 He tries to estimate the marginal utility
of money in the United States. He uses yearly data for the United States,
1922-32. During this period the per capita consumption of food was
nearly constant. Relative marginal utility of real dollars is estimated by
the ratio between an index of cost of living, excluding food, and the index
of food prices. This assumes the existence of a utility function which
has a peculiar form. 9 The utility of food and the utility of non-food
must be (approximately) independent. The marginal utility of real dollars
U is expressed as a function of the amount of real per capita non-food
expenditure R.

Waugh chooses because of general considerations a specific form of this
function and achieves a fit by the classical method of least squares:

\[ U = \frac{0.74074}{\log R - \log 144.67} \tag{1} \]

Marginal utility of expenditure is a decreasing function of expenditure.
This is in agreement with economic theory. The minimum of existence
appears here as $144.67.

The equation fitted by Waugh on the basis of theoretical considerations 
first supplied by Frisch has a number of properties which are desirable 
from the point of view of economic theory: 

(1) The minimum of existence is taken into account. In our case it
appears as $144.67, perhaps not a bad estimate for the particular period
in the United States.

(2) As expenditure R approaches the minimum of existence, marginal 
utility becomes infinite. This is clearly the case with our function. As 
expenditure R becomes larger and larger, marginal utility becomes smaller 
and smaller. It is finally zero for infinite income. 


4 I. Fisher: “A statistical method for measuring marginal utility and testing
the justice of a progressive income tax,” Economic Essays, Contributed in Honor
of John Bates Clark (New York, 1927), pp. 157 ff.

5 R. Frisch: New Methods of Measuring Marginal Utility (Tuebingen, 1932).

6 F. V. Waugh: “The marginal utility of money in the United States from
1917 to 1921 and from 1922 to 1932,” Econometrica, vol. 3 (1935), pp. 376 ff.
W. Winkler: Grundfragen der Oekonometrie (Vienna, 1951), pp. 19 ff., 48 ff.

7 Ibid.

8 R. Frisch: op. cit.

9 P. A. Samuelson: Foundations of Economic Analysis (Cambridge, Mass.,
1947), pp. 174 ff.





(3) Marginal utility is a decreasing function of expenditure for expenditures
larger than the minimum of existence. This condition is fulfilled in
our case. Its theoretical validity has recently been challenged by Friedman
and Savage. 10

(4) The elasticity of the marginal utility of money with respect to
expenditure R is greater than unity for values of R which are small but
greater than the minimum of existence. Waugh provides Table I.

TABLE I

Real Expenditures on Non-Food,     Elasticity of Marginal Utility
1913 dollars per capita            with Respect to Expenditure

150                                -0.772
200                                -0.623
250                                -0.547
300                                -0.496

This table indicates that, for a family which spent only $150 on non-food, an
increase of 1 per cent in non-food expenditure decreased the marginal
utility of expenditure by almost 8/10 of 1 per cent. But for a family which
spent $300 on non-food, an increase of 1 per cent in expenditure on non-food
decreased the marginal utility of expenditure by only about 1/2 of
1 per cent.

(5) As expenditure is indefinitely increased, the elasticity defined above
tends to become smaller and smaller until it is zero for infinite expenditure.
This is clearly the case with our function (1), as indicated above.

These results are of course subject to severe limitations, but perhaps
point a way to further investigations of the subject. A knowledge of the
marginal utility of money for various classes of the population would be
of paramount importance for various aspects of the modern theory
of welfare economics. 11 One particular field where its importance
has long been recognized is taxation. 12 A new theory of measurable
utility has been given by various authors. 13 A critical analysis of the

10 M. Friedman and L. J. Savage: "The utility analysis of choices involving
risk," Journal of Political Economy, vol. 56 (1948), pp. 279 ff.

11 P. A. Samuelson: op. cit., pp. 219 ff. H. Myint: Theories of Welfare
Economics (London, 1948), pp. 199 ff.

12 P. A. Samuelson: op. cit., pp. 226 ff.

13 G. Tintner: "The theory of choice under subjective risk and uncertainty,"
Econometrica, vol. 9 (1941), pp. 305 ff.; "A contribution to the non-static
theory of choice," Quarterly Journal of Economics, vol. 51 (1942), pp. 274 ff.
K. Menger: "Das Unsicherheitsmoment in der Wertlehre," Zeitschrift für



SOME ILLUSTRATIONS OF ECONOMETRIC RESEARCH




assumptions underlying the derivation of utility functions is due to A. 
Burke. 14 

Example 2. The method of Wald 15 has been used by Professor J. A.
Nordin 16 in order to approximate an indifference system. This study
distinguished only food (x) and non-food (y), measured in dollars. Income is
denoted by E. This method assumes that the indifference system is a
polynomial of second degree in a small region. The Engel curves are linear.

The data are taken from the publication, “Family spending and saving 
in war time,” U.S. Bureau of Labor Statistics, Bulletin 822. The 1941 
study is based upon about 3060 families. Incomes over $5000 have been 
excluded to preserve linearity of the Engel curves. The authors of this 
study made adjustments for the period 1935-36 in order to make the 
studies comparable. The 1935-36 study was based upon 300,000 families. 

Expenditure is deflated by price indices. Income is estimated disposable 
personal income (E). 

The Engel curves for 1935-36 are: 

(2) x = 150.328983 + 0.1728186E 

This is the Engel curve for food consumption. The Engel curve for
non-food consumption is:

(3) y = - 155.5803 + 0.847353E 

The correlation coefficient is 0.983. This is significant on the 1 per 
cent level. 

The Engel curves for 1941 are: 

(4) x = 251.8303696 + 0.178436E 

Nationaloekonomie, vol. 5 (1934), pp. 458 ff. J. von Neumann and O.
Morgenstern: Theory of Games and Economic Behavior (2nd ed., Princeton, 1947),
pp. 15 ff., 617 ff. J. Marschak: "Rational behavior, uncertain prospects and
measurable utility," Econometrica, vol. 18 (1950), pp. 111 ff. R. L. Bishop:
"Professor Knight and the theory of demand," Journal of Political Economy,
vol. 54 (1946), pp. 141 ff. F. H. Knight: "Comment on Mr. Bishop's article,"
ibid., pp. 170 ff. H. S. Houthakker: "Revealed preference and the utility
function," Economica, new series, vol. 17 (1950), pp. 159 ff. P. A. Samuelson:
"The problem of integrability in utility theory," Economica, new series, vol. 17
(1950), pp. 355 ff.

14 A. Burke: "Real income, expenditure proportionality and Frisch's New
Methods of Measuring Marginal Utility," Review of Economic Studies, vol. 4
(1936), pp. 33 ff. J. N. Morgan: "Can we measure the marginal utility of
money?" Econometrica, vol. 13 (1945), pp. 129 ff.

15 A. Wald: op. cit. 

16 I am much obliged to Professor J. A. Nordin for permission to include his 
results. He was assisted by Mrs. L. T. Smythe. 





This is the Engel curve for food consumption. The Engel curve for
non-food consumption is:

(5) y = - 252.918 + 0.772755E

The correlation coefficient is 0.998. This is again significant on the 1 per
cent level.

Using Wald's method, the following indicator (utility index) has been
derived:

(6)  $U = -0.000890x^2 + 0.008353y^2 + 0.022401xy + 104.572144x + 96.686771y$

Of course, this is not the only indicator which might be used. We
could also use, e.g.:

(7)  $V = a + bU$

where a is an arbitrary constant and b a positive constant.

The marginal rate of substitution between non-food (y) and food (x)
is from this:

(8)  $-\dfrac{dy}{dx} = \dfrac{-0.001780x + 0.022401y + 104.572144}{0.022401x + 0.016706y + 96.686771}$


Let now $p_x$ be the price of food, $p_y$ the price of non-food, and E income.
We can derive the demand function for food:

(9)  $x = \dfrac{E(0.022401p_y - 0.016706p_x) + p_y(104.572144p_y - 96.686771p_x)}{0.001780p_y^2 + 0.044802p_xp_y - 0.016706p_x^2}$

The demand function for non-food is:

(10)  $y = \dfrac{E(0.001780p_y + 0.022401p_x) - p_x(104.572144p_y - 96.686771p_x)}{0.001780p_y^2 + 0.044802p_xp_y - 0.016706p_x^2}$
Let us assume, for instance, that we have $p_x = 1$, $p_y = 1.1$, E = 2500.
Then the estimated demand for food is x = 1152.155, and the estimated
demand for non-food is y = 1225.307.

These and similar results should be useful for applications in economic 
policy. 
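The step from the indicator (6) to the demand functions (9) and (10) can be checked numerically. The following is a minimal sketch in Python, assuming that demand is obtained from the tangency condition (marginal rate of substitution equal to the price ratio) together with the budget equation $p_x x + p_y y = E$. The function name `demand` and the constant names are illustrative and do not come from the original study.

```python
# Coefficients of the utility indicator (6), written as
# U = -A*x**2 + B*y**2 + C*x*y + D*x + F*y   (names are illustrative).
A, B, C, D, F = 0.000890, 0.008353, 0.022401, 104.572144, 96.686771


def demand(px, py, income):
    """Food (x) and non-food (y) demand from tangency plus the budget equation.

    Tangency: (dU/dx) / (dU/dy) = px / py.  Budget: px*x + py*y = income.
    Solving the two linear equations gives the closed forms (9) and (10).
    """
    denom = 2 * A * py**2 + 2 * C * px * py - 2 * B * px**2
    x = (income * (C * py - 2 * B * px) + py * (D * py - F * px)) / denom
    y = (income * (2 * A * py + C * px) - px * (D * py - F * px)) / denom
    return x, y


# Numerical case used in the text: p_x = 1, p_y = 1.1, E = 2500.
x, y = demand(px=1.0, py=1.1, income=2500.0)
print(round(x, 2), round(y, 2))
```

With the printed coefficients this gives values within a few hundredths of the figures quoted in the text, and the two demands exhaust the budget exactly.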

Example 3. A related approach to a study of consumers' preferences 
is the investigation of the variation of the expenditures on various items 
(e.g., food and clothing) with the change in income or total expenditure. 
The relationship which describes the variation of a specific expenditure 
in terms of the total expenditure is called an Engel curve. 






Allen and Bowley 17 in a very interesting study made the attempt to
establish empirical Engel curves for a great variety of budget data. They
use data from England, Germany, France, Belgium, Holland, Denmark,
Norway, Sweden, Finland, Switzerland, Czechoslovakia, Poland, and the
United States. We present here only a very small part of their analysis,
based upon the family budgets of 112 clerks, collected in 1926 in London
and other large English towns.

Denote by $f_1$ the expenditure on food, rent, and clothing, by $f_2$ the
expenditure on fuel and light, by $f_3$ the expenditure on other items. Let
e be total expenditure. All the variables are measured in pounds sterling.
The average expenditure of the 112 households was £482.

The Engel curves are fitted by simple least squares regression and are 
as follows: 

(11)  $f_1 = 0.47e + 62.66$

(12)  $f_2 = 0.06e + 4.82$

(13)  $f_3 = 0.47e - 67.48$


These relationships show the following features: If one additional 
pound is expended, then, other things being equal, about 0.47 of it will 
be spent on food, rent, and clothing, approximately 0.06 on fuel and 
light, and about 0.47 on all other items. 

Income elasticities of demand can also be computed from these data. 
The income elasticities computed for the average income are as follows: 
for food, rent, and clothing, 0.8; for fuel and light, 0.8; for all other 
items, 1.5. 

That is to say, if total expenditure rises by 1 per cent, then, ceteris
paribus, the expenditure for food, rent, and clothing on the one hand, and
for fuel and light on the other hand, will each increase by about 4/5 of
1 per cent. But, under similar circumstances, the expenditure on other
items will increase by approximately 1.5 per cent. That is to say, a
proportional increase in expenditure, brought about, for instance, by an
increase in salary, will stimulate the expenditure on necessities less than
the expenditure on other items.
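The elasticities quoted above follow from the fitted lines: for a linear Engel curve the elasticity at total expenditure e is the slope times e divided by the fitted expenditure. The sketch below, in Python, applies this to the rounded coefficients of (11) to (13) at the average expenditure of £482. The dictionary layout and function name are illustrative. Because the printed coefficients are rounded, the results agree with the figures in the text only approximately.

```python
# Engel curves (11)-(13): (slope, intercept) for each expenditure group.
ENGEL = {
    "food, rent, clothing": (0.47, 62.66),
    "fuel and light": (0.06, 4.82),
    "other items": (0.47, -67.48),
}

MEAN_E = 482.0  # average total expenditure in pounds sterling


def elasticity(slope, intercept, e):
    """Elasticity of a linear Engel curve at total expenditure e."""
    return slope * e / (slope * e + intercept)


elasticities = {
    name: elasticity(slope, intercept, MEAN_E)
    for name, (slope, intercept) in ENGEL.items()
}

for name, value in elasticities.items():
    print(f"{name}: {value:.2f}")

# Adding-up check: the slopes sum to one and the intercepts to zero,
# so the three groups exhaust total expenditure.
slope_sum = sum(slope for slope, _ in ENGEL.values())
intercept_sum = sum(intercept for _, intercept in ENGEL.values())
```

The adding-up check is a useful sanity test on any set of Engel curves fitted to an exhaustive classification of expenditure.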

These results, if reliable, may be of some use in economic policy. 
Suppose, for instance, that the government wants clerks to increase their 


17 R. G. D. Allen and A. L. Bowley: Family Expenditure (London, 1935).

F. H. Knight: "Marginal utility economics" in Encyclopaedia of the Social
Sciences, vol. 5 (New York, 1937), pp. 357 ff. D. S. Brady: "Expenditure,
saving, and income," Review of Economic Statistics, vol. 28 (1946), pp. 216 ff.

G. Stuvel and S. F. James: "Household expenditures on food in Holland,"
Journal of the Royal Statistical Society, series A, vol. 113 (1950), pp. 59 ff.



3.6] 


TABLEAU ECONOMIQUE 


63 


expenditure on food, rent, and clothing by 10 per cent. Then it should 
be aware that, other things being equal and the validity of the results 
being assumed, total expenditure has to increase by 12.5 per cent. This 
increase may be brought about by an appropriate policy regarding wages 
and salaries. 

The fact that dynamic relations in consumption have been neglected 
impairs somewhat the usefulness of the results. 18 


3.6 Tableau economique 

This very simplified model of static 1 equilibrium has been known to
economists since the time of Quesnay. 2 Static equilibrium is here defined
as a timeless economic system. Doubtless Quesnay himself was under
the impression that it would eventually be possible to derive statistical
verification for his ideas.

The pioneer effort in this field is probably due to G. C. Means. 3 His
methods are, however, somewhat primitive; he expressed a number of
economic variables like production, employment, and construction in
terms of consumers' income. These relationships were derived by simple
regression analysis.

Example 1. Much more refined and promising are the methods of
W. W. Leontief 4 of Harvard University. He has constructed what must be
called a veritable tableau economique for the United States and found
numerical values for the various entries by the use of census data. His
statistical methods are, however, somewhat primitive. For instance, he
completely neglects errors of observations (section 6.5). This neglect is
certainly unjustified, as far as the data upon which his analysis is based
are concerned.

We present in Table 1 the tableau economique for the last prewar
year, 1939, compiled for the United States by Leontief. 5 In Leontief's

18 V. L. Mirkin: Income-Consumption Relationships of Certain Iowa Farms
(unpublished thesis, Ames, Iowa, 1946).

1 J. R. Hicks: Value and Capital (2nd ed., Oxford, 1948), pp. 125 ff.

2 F. Quesnay: Tableau economique (Versailles, 1758; reprint, H. Higgs,
London, 1894).

3 G. C. Means: Patterns of Resources Use, Industrial Committee of the
National Resources Committee (Washington, 1939).

4 W. W. Leontief: "Interrelations of prices, output, savings and investment,"
Review of Economic Statistics, vol. 19 (1939), pp. 109 ff.; The Structure of
American Economy, 1919-29 (2nd ed., New York, 1951).

5 W. W. Leontief: "Output, employment, consumption and investment,"
Quarterly Journal of Economics, vol. 58 (1944), pp. 290 ff.; The Structure of
American Economy, 1919-29 (2nd ed., New York, 1951), p. 140.





table each row of figures shows the distribution of the particular output 
of one specific industry among all other industries; foreign trade (exports) 
and households are also considered separate “industries.” The sums in 
the last column do not check because of rounding. 


TABLE 1

(Billions of dollars)

Rows: Distribution of Output of Classes for the First Column.
Columns: Distribution of Outlays. A dash marks an empty cell.

                                1     2     3     4     5     6     7     8     9    10    11   Total

 1. Agriculture and food        -     -     -     -    0.6    -    0.6   0.6    -    0.7  14.5   17.0
 2. Minerals                   0.1    -    1.2    -     -     -    0.2   1.3    -    0.9   0.1    3.8
 3. Metal fabricating          0.7   0.1    -    0.3   0.1   0.3   1.1   2.1   0.3   4.2   3.0   12.3
 4. Fuel and power             0.4   0.3   0.4    -    0.1   0.3   0.7   0.4   0.2   2.6   3.5    8.9
 5. Textile, leather, rubber   0.1    -    0.3    -     -     -    0.2   0.1    -    0.8   5.4    7.0
 6. Railroad transportation    1.3   0.3   0.4   1.0    -     -     -    0.5   0.1    -    0.7    4.3
 7. Foreign trade              1.0   0.4    -    0.1   0.2    -     -    0.5    -    0.6    -     2.8
 8. Industries n.e.c.          0.9   0.1   0.4   1.0   0.5   0.6   0.4    -    4.6   5.2   5.6   19.2
 9. Government                 1.1    -    0.2   0.2    -     -     -    0.1    -    9.7   2.6   13.8
10. Other industries           8.2   1.5   3.4   3.1   3.1   0.7   0.1   9.1   2.8    -   28.9   60.9
11. Households                 4.2   1.1   6.1   3.6   2.9   2.5    -    5.5   7.9  34.5    -    68.8


Consider, for instance, the third line in the table, which refers to the
metal fabricating industry. It shows that, of the total net output of
12.3 billion dollars in 1939, 0.7 billion went to agriculture and food
industries, 0.1 to minerals, 0.3 to fuel and power industries, 0.1 to textile,
leather, and rubber industries, 0.3 to railroad transportation, 1.1 billion to
foreign trade (exports), 2.1 to industries n.e.c. (chemicals, lumber and wood
industry, furniture, paper, printing, construction), 0.3 to the government
(taxes), 4.2 to other industries (banking, insurance, advertising, various
services, rents, laundry, amusements), and finally 3.0 to households
(i.e., consumption).

From this table, Leontief computes the technical coefficients of
production 6 which are assumed to be constant. This assumption, which excludes


6 G. J. Stigler: Production and Distribution Theories (New York, 1949), pp.
232 ff. N. Georgescu-Roegen: "Fixed coefficients of production and marginal
productivity," Review of Economic Studies, vol. 3 (1935), pp. 214 ff. T. C.
Koopmans: "Analysis of production as an efficient combination of activities,"
in T. C. Koopmans, ed.: Activity Analysis of Production and Allocation (New York,
1951), pp. 33 ff. H. M. Smith: "Uses of Leontief's open input-output model,"
ibid., pp. 132 ff. N. Georgescu-Roegen: "Some properties of a generalized








increasing or decreasing marginal returns, is a very serious limitation of 
his work. 
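What constant technical coefficients imply can be seen in a very small case. The following Python sketch uses a hypothetical two-industry flow table (the numbers are invented for illustration and are not Leontief's data). Each coefficient is the input delivered by one industry per unit of output of the receiving industry. If the coefficients stay fixed, the outputs needed to meet any final demand follow from a system of linear equations.

```python
# Hypothetical two-industry flow table (NOT taken from Table 1).
# flows[i][j] = deliveries from industry i to industry j;
# final[i]    = deliveries from industry i to final demand.
flows = [[10.0, 30.0],
         [20.0, 10.0]]
final = [60.0, 70.0]

# Total output of each industry: intermediate deliveries plus final demand.
output = [sum(row) + f for row, f in zip(flows, final)]

# Technical coefficient a[i][j]: input from i per unit of output of j.
a = [[flows[i][j] / output[j] for j in range(2)] for i in range(2)]


def required_output(coeff, demand):
    """Solve (I - A) x = demand for two industries by Cramer's rule."""
    m11, m12 = 1.0 - coeff[0][0], -coeff[0][1]
    m21, m22 = -coeff[1][0], 1.0 - coeff[1][1]
    det = m11 * m22 - m12 * m21
    x1 = (demand[0] * m22 - m12 * demand[1]) / det
    x2 = (m11 * demand[1] - m21 * demand[0]) / det
    return [x1, x2]


# The observed final demand reproduces the observed outputs.
base = required_output(a, final)

# A 10 per cent rise in final demand for the first industry's product
# raises the required output of BOTH industries.
shifted = required_output(a, [66.0, 70.0])
print(base, shifted)
```

The linearity of this calculation is exactly what the assumption of constant coefficients buys, and also what rules out increasing or decreasing returns.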

Example 2. In subsequent articles Leontief has utilized a similar
system to investigate the relationships between exports, imports, domestic
output, and employment. 7 He also investigates wages, profits, and
prices. 8

He comes, for instance, to the conclusion that a 10 per cent wage rise
in all industries would produce, ceteris paribus, a price increase of about
5.6 per cent for the products of the metal fabricating industry. A 10 per
cent increase of wages in only the metal fabricating industry itself would
produce, other things being equal, an increase in prices for the products
of this industry by about 3.5 per cent. An increase of 10 per cent of
non-wage income in all industries would raise, ceteris paribus, the prices
in the metal fabricating industry by approximately 1.6 per cent.

A similar increase by 10 per cent of the non-wage income in the metal
fabricating industry alone would, other things being equal, increase the
prices of this industry by about 0.65 per cent, etc. The failure to indicate
the statistical reliability of these estimates is a serious shortcoming of
Leontief's methods. No fiducial or confidence limits are given, and no
significance tests are performed.

It is evident that these and other results, tentative as they may be, are
very interesting to the theoretical economist. They are also of great
potential importance for the economist concerned with economic policy.
In a way, they are serious rivals to the static and dynamic models
constructed on the basis of macroeconomic theory, as far as government
planning is concerned. These models will be discussed in the next
sections, 3.7 and 3.8. If proper statistical methods could be developed
which would allow the statistical reliability of the findings to be
ascertained, then these results might be ultimately of very great value in
governmental planning. It would also be necessary to drop the
assumption of constant coefficients of production and take certain dynamic
features of the economy into account. 9

Leontief model," ibid., pp. 132 ff. H. A. Simon: "Effects of technological
change in a linear model," ibid., pp. 260 ff.

7 W. W. Leontief: "Exports, imports, domestic output and employment,"
Quarterly Journal of Economics, vol. 60 (1946), pp. 171 ff.; The Structure of
American Economy, 1919-29 (2nd ed., New York, 1951), pp. 163 ff.

8 W. W. Leontief: "Wages, profits and prices," Quarterly Journal of
Economics, vol. 61 (1946), pp. 26 ff.; The Structure of American Economy, 1919-29
(2nd ed., New York, 1951), pp. 188 ff.

9 G. Tintner: "A contribution to the non-static theory of production," in
Studies in Mathematical Economics and Econometrics, in Memory of Henry




F. V. Waugh has recently given very efficient mathematical methods
for inverting the matrix of Leontief. 10 A critical account of the methods
is due to Georgescu-Roegen. 11

3.7 Static Models for the Total Economy 

Such models have been investigated by various econometricians
connected with the Cowles Commission. 1 We present here as an example
the pioneer work by one of the originators of the modern approach in
econometrics, T. Haavelmo. 2

Example 1. He deals with the measurement of the propensity to
consume in the United States. He uses annual data for the United States,
1930-41. There are 13 yearly observations. Let us denote by c
consumers' expenditure; by y disposable income; by r gross business savings;
by x gross investment. All the variables are expressed in dollars per
capita, deflated. The last variable, x (gross investment, i.e., gross capital
formation plus government net deficit), is assumed to be exogenous;
i.e., it is not determined in the system but considered as given by forces
outside the economic system analyzed. All variables are deflated by the
Bureau of Labor Statistics cost of living index.

The system is assumed to be linear. Haavelmo assumes, like most
workers who use the methods developed by the Cowles Commission,
that there are errors in the equations but no errors in the variables
(Chapter 7). That is to say, the effect of variables which have not been
included in the equations is taken into account. But it is assumed that,
e.g., the data are free from errors of observations (section 6.5). This is
a very unrealistic assumption in the present case.

The parameters are estimated by an adaptation of the least squares 
regression method. This method is described in section 7.2. What are 
estimated are not the parameters but certain functions of the structural 



Schultz (Chicago, 1942), pp. 92 ff.; "The theory of production under non-static
conditions," Journal of Political Economy, vol. 50 (1942), pp. 645 ff.

10 F. V. Waugh: "Inversion of the Leontief matrix," Econometrica, vol. 18
(1950), pp. 142 ff.

11 N. Georgescu-Roegen: "Leontief's system in the light of recent results,"
Review of Economics and Statistics, vol. 32 (1950), pp. 214 ff.

1 "Estimating relations from non-experimental observations," abstracts from
papers presented at Cleveland, Jan. 25, 1946, Econometrica, vol. 14 (1946),
pp. 165 ff. See also T. C. Koopmans, ed.: Statistical Inference in Dynamic
Economic Models (New York, 1950).

2 T. Haavelmo: "Methods of measuring the marginal propensity to consume,"
Journal of the American Statistical Association, vol. 42 (1947), pp. 105 ff.





parameters from which they themselves can be computed. The resulting
system is as follows:

(1)  c = 0.712y + 95.05

(2)  r = 0.158(c + x) - 34.30

(3)  y = c + x - r


The system is identified if x is exogenous. Identification is discussed
in section 6.5 and Chapter 7. The first equation (1) gives the consumption
function; the marginal propensity to consume 3 is estimated as 0.712.
From this we compute the so-called multiplier 4 as 1/(1 - 0.712) = 3.47.
This shows that for every dollar received in disposable income about 3.5
dollars will eventually be spent in the economy in the long run. The
multiplier gives the ratio of total increase in national income to total
amount of investment.

The second equation (2) is a business savings equation. It shows the 
relationship between business savings and the sum of gross investment 
and consumers' expenditure. 

The parameters of these two equations have been estimated by
statistical methods from the data. It should be emphasized that the
modification of the least squares method used takes the interactions of the various
relationships in the system into account. What is directly estimated are
not the parameters of the two equations themselves but only certain
functions of these parameters. It is, however, possible to derive from
the estimates of these functions the estimates of the parameters represented
in the two equations. This would not be feasible if the system was not
just identified (see section 7.2).
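One way to read "certain functions of these parameters" is as the coefficients of the equations that express each endogenous variable in terms of x alone, given as (4) to (6) later in this section. The Python sketch below is an illustration under that assumption, not Haavelmo's actual computation. It recovers the parameters of equations (1) and (2) from those coefficients. The variable names are illustrative, and since the printed coefficients are rounded, the recovered intercepts match the text only approximately.

```python
# Coefficients of the equations expressing c, r, y in terms of x,
# as printed in equations (4)-(6): variable = slope * x + constant.
c_slope, c_const = 1.497, 298.309
r_slope, r_const = 0.395, 12.733
y_slope, y_const = 2.102, 285.576

# Consumption function (1): c = alpha * y + beta.
# Substituting y = y_slope * x + y_const and matching terms in x gives:
alpha = c_slope / y_slope
beta = c_const - alpha * y_const

# Business savings equation (2): r = mu * (c + x) + nu.
# Here c + x = (c_slope + 1) * x + c_const, so:
mu = r_slope / (c_slope + 1.0)
nu = r_const - mu * c_const

print(round(alpha, 3), round(beta, 2), round(mu, 3), round(nu, 2))
```

Each parameter is obtained from exactly one ratio or difference, which is what the remark about the system being just identified refers to: with more restrictions than needed, different routes would give conflicting values.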

The third equation (3) is simply a definition of disposable income as
consumers' expenditure plus gross investment minus gross business
savings. This definitional equation is exact and nothing in it has to be
estimated.

From the system of three equations we can derive expressions for c 


3 E. W. Gilboy: "The propensity to consume," Quarterly Journal of
Economics, vol. 53 (1938), pp. 120 ff.

4 R. F. Kahn: "The relation of home investment to unemployment,"
Economic Journal, vol. 41 (1931), pp. 173 ff. J. M. Keynes: General Theory of
Employment, Interest and Money (New York, 1936), pp. 115 ff. J. M. Clark:
The Economics of Planning Public Works (Washington, D.C., 1935). F.
Machlup: "Period analysis and multiplier theory," Readings in Business Cycle Theory
(Philadelphia, 1944), pp. 203 ff. J. Ullmo: "Une extension de la theorie du
multiplicateur," Economie appliquee, vol. 2 (1949), pp. 321 ff.





(consumers' expenditure), r (gross business savings), and y (disposable
income) in terms of the exogenous variable x (gross investment, i.e.,
gross capital formation plus government deficit). These equations are:

(4)  c = 1.497x + 298.309

(5)  r = 0.395x + 12.733

(6)  y = 2.102x + 285.576


For the average of the period in question the values of our variables
were c = 438.7690, x = 93.5385, y = 482.6920, r = 49.6155. On the
basis of the three equations given above [(4), (5), (6)] we can construct
forecasts for c, r, and y if x is given. Since x is gross capital formation
plus total government deficit, it can always be determined by government
action in fixing the second quantity.

Table 1 gives the values of c, r, and y which correspond to selected
values of x.

TABLE 1

x, Gross       c, Consumers'    r, Gross Business    y, Disposable
Investment     Expenditure      Saving               Income

100            448.009          52.232               495.776
150            522.859          71.983               600.876
200            597.709          91.733               705.976

To illustrate this table we proceed as follows: If, for instance,
gross investment (gross capital formation plus government net deficit) is
200 current deflated dollars per capita, then the other variables will
presumably have the values given in the last line of Table 1, if the results
of the analysis hold. Conditions must be fundamentally the same as in
the period considered, and the linear relationships which have been
established statistically must remain valid. Errors in the variables are
also neglected (section 6.5). Disregarding this objection, consumers'
expenditure (c) will be about $600 per capita, gross business savings (r)
a little more than $90 per capita, and disposable income (y) over $700
per capita.
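The entries of Table 1 can be reproduced directly from equations (4), (5), and (6). A minimal Python sketch follows; the function name `forecast` is illustrative. It also evaluates the multiplier from the marginal propensity to consume and checks that the forecasts satisfy the definitional equation (3).

```python
def forecast(x):
    """Forecasts of c, r, y from equations (4)-(6) for gross investment x."""
    c = 1.497 * x + 298.309   # consumers' expenditure, equation (4)
    r = 0.395 * x + 12.733    # gross business savings, equation (5)
    y = 2.102 * x + 285.576   # disposable income, equation (6)
    return c, r, y


for x in (100, 150, 200):
    c, r, y = forecast(x)
    # Equation (3), y = c + x - r, should hold up to floating-point error.
    gap = y - (c + x - r)
    print(x, round(c, 3), round(r, 3), round(y, 3), abs(gap) < 1e-9)

# Multiplier implied by the marginal propensity to consume of 0.712.
multiplier = 1.0 / (1.0 - 0.712)
print(round(multiplier, 2))
```

Because the three equations are linear in x, every additional dollar of gross investment per capita changes the forecasts by the same fixed amounts, whatever the starting level.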

Results like this are obviously of some importance for long-range 
governmental planning. But it should be remembered that dynamic 
factors and errors of observations are neglected and that we are evidently 
dealing with a very simplified model of the total economy. It is also 
doubtful whether the assumed linearity of the relationships holds true. 

Dynamic models of a similar type will be discussed in the next section,
3.8.



3.8] 


DYNAMIC MODELS OF THE ECONOMY 


69 


3.8 Dynamic Models of the Economy 

These are by far the theoretically and practically most interesting
econometric investigations. The pioneer work in this field has been done by
the Dutch econometrician J. Tinbergen. 1

Example 1. We lack the space to present Tinbergen's great dynamic
model of the American economy, 1919-32. 2 But to illustrate his
procedures we will present one of the preliminary models discussed. This is
the "explanation" of the consumption of pig iron in the United Kingdom,
1920-36. 3 It is interesting that Tinbergen talks of an "explanation" and
does not try to identify (section 6.5 and Chapter 7) the economic relations
which he establishes. For instance, he does not say whether they are
demand or supply functions.

The variables are $x_1$, consumption of pig iron; $x_2$, profits; $x_3$, price of
pig iron. These variables are in per cent deviations from the trend.
$x_4$ is the long-term interest rate and is measured in absolute deviations
from its trend, in units of 0.01 per cent. The equation is:

(1)  $x_{1,t} = 1.11\,x_{2,t-1} - 0.24\,x_{3,t-1/2} - 0.08\,x_{4,t-1/2}$

This equation is derived by classical least squares methods. Tinbergen 
uses very extensively Frisch's bunch map analysis 4 to guard himself 

1 J. Tinbergen: "Annual survey: quantitative business cycle theory,"
Econometrica, vol. 3 (1935), pp. 241 ff.; An Econometric Approach to Business
Cycle Problems (Paris, 1937); Les fondements mathematiques de la stabilisation
des affaires (Paris, 1938); "Critical remarks on some business cycle theories,"
Econometrica, vol. 10 (1942), pp. 129 ff. C. Clark: "A system of equations
explaining the U.S. trade cycle 1921-1941," ibid., vol. 17 (1949), pp. 93 ff.
R. M. Goodwin: "Econometrics in business-cycle analysis," in A. H. Hansen:
Business Cycles and National Income (New York, 1951), pp. 417 ff. E. A.
Radice: "A dynamic scheme for the British trade cycle," Econometrica, vol. 7
(1939), pp. 47 ff. J. S. Pesmazoglu: "Some international aspects of British
cyclical fluctuations, 1870-1913," Review of Economic Studies, vol. 16 (1949),
pp. 117 ff.

2 J. Tinbergen: Statistical Testing of Business Cycle Theories, vol. 2: Business
Cycles in the United States of America, 1919-1932 (Geneva, League of Nations,
1939).

3 J. Tinbergen: Statistical Testing of Business Cycle Theories, vol. 1: A
Method and Its Application to Investment Activity (Geneva, League of Nations,
1939), pp. 34 ff.

4 R. Frisch: Statistical Confluence Analysis by Means of Complete Regression
Systems (Oslo, 1934). O. Reiersol: "Confluence analysis by means of
instrumental sets of variables," Arkiv for Matematik, Astronomi och Fysik, vol. 32A,
no. 4 (Stockholm, 1945); "Confluence analysis by means of lag moments and
other methods of confluence analysis," Econometrica, vol. 9 (1941), pp. 1 ff.




against multicollinearity (section 6.5). The problem of fitting dynamic 
models of this type will be treated in section 10.3.7. 

Consumption of pig iron is the higher the higher profits, the lower
pig-iron prices, and the lower long-term interest rates. It should be noted
that in this equation the variable $x_2$ (profits) appears with a lag of 1 unit
(i.e., 1 year). The variables $x_3$ (price of pig iron) and $x_4$ (long-term
interest rate) appear with lags of 1/2 unit (i.e., 6 months). These lags
constitute the dynamic feature of Tinbergen's model. They lead to
difference equations which may give rise to solutions with periodic
fluctuations. 5 In other words, they can explain what the economist calls
the business cycle.

Example 2. In the second volume of his study, Tinbergen analyzed
data for business cycles in the United States, 1919-32. 6 He considers a
very ambitious system containing 70 variables, some of which enter with
lags. After careful analysis of the empirical results and elimination of
all other variables, he obtains for the special case of the absence of a
stock exchange boom and of hoarding the following equation: 7

(2)  $Z_t = 0.398Z_{t-1} - 0.220Z_{t-2} + 0.013Z_{t-3} + 0.027Z_{t-4}$

Z is here the net income of corporations, expressed in standard units.
This difference equation gives rise to a cycle with a period of 4.8 years.
The cyclical fluctuations are damped, the ratio of damping being 1.89.
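The period and the damping ratio can be checked from the roots of the characteristic equation of (2), $z^4 - 0.398z^3 + 0.220z^2 - 0.013z - 0.027 = 0$. The Python sketch below finds the roots by Durand-Kerner iteration, which is simply a convenient choice here and not Tinbergen's own procedure. The period is taken as $2\pi$ divided by the angle of the complex root, and the damping ratio is read as the reciprocal of its modulus, which is an assumption about how the figure 1.89 is defined.

```python
import cmath
import math

# Characteristic polynomial of equation (2), highest power first.
COEFFS = [1.0, -0.398, 0.220, -0.013, -0.027]


def poly_value(coeffs, z):
    """Evaluate the polynomial at z by Horner's rule."""
    value = 0
    for coefficient in coeffs:
        value = value * z + coefficient
    return value


def durand_kerner(coeffs, iterations=500):
    """Approximate all roots of a monic polynomial simultaneously."""
    degree = len(coeffs) - 1
    # Standard starting points: powers of a number that is neither real
    # nor a root of unity.
    roots = [complex(0.4, 0.9) ** k for k in range(degree)]
    for _ in range(iterations):
        updated = []
        for i, r in enumerate(roots):
            denom = 1
            for j, s in enumerate(roots):
                if j != i:
                    denom *= r - s
            updated.append(r - poly_value(coeffs, r) / denom)
        roots = updated
    return roots


roots = durand_kerner(COEFFS)

# The root with the largest imaginary part belongs to the complex pair
# that generates the cycle.
z = max(roots, key=lambda r: r.imag)
period = 2 * math.pi / cmath.phase(z)   # length of the cycle in years
damping = 1 / abs(z)                    # reciprocal of the modulus
print(round(period, 1), round(damping, 2))
```

The two remaining roots are real and smaller in absolute value than the modulus of the complex pair, so they contribute only non-cyclical components that die out faster.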

Since we live in a dynamic economy, static models or models based upon 
a stationary state like the ones presented in the previous section are 
evidently less useful than dynamic models like Tinbergen's. But there 
are still important shortcomings in his analysis. In order for his procedure 
to be really useful for applications in policy it would be necessary to know 
more about the statistical reliability of the results. For certain policy 
applications the equations need to be identified. The problem of
identification will be discussed in section 6.5 and Chapter 7. Since they are
based upon a rather short period of observations, the results are probably 
not very reliable and stable from a statistical point of view. 8 

5 P. A. Samuelson: "Dynamic process analysis," in H. S. Ellis, ed.: A Survey
of Contemporary Economics (Philadelphia, 1948), pp. 352 ff. W. J. Baumol:
Economic Dynamics (New York, 1951).

6 J. Tinbergen: Statistical Testing of Business Cycle Theories, vol. 2: Business
Cycles in the United States of America, 1919-1932 (Geneva, League of Nations,
1939).

7 Ibid., pp. 136 ff.

8 J. M. Keynes: "Review of J. Tinbergen: Statistical Testing of Business
Cycle Theories, Part I," Economic Journal, vol. 49 (1939), pp. 558 ff. J.
Tinbergen: "On a method of statistical business cycle research: a reply," ibid.,
vol. 50 (1940), pp. 141 ff. J. M. Keynes: "Comment," ibid., pp. 154 ff.





Example 3. L. R. Klein 9 gives in a remarkable paper a number of 
dynamic models for the American economy. The equations in all his 
models are identified with certain structural economic relationships, of 
which they may be regarded as the estimates. The largest model contains 
29 variables and no less than 16 equations. We will present only one of 
his simpler models. 10 

The data are annual time series from the United States for the period
1921-41. The variables are C, consumers' expenditure; I, gross private
capital formation; G, government expenditure on goods and services;
Y, disposable income; P, gross national product; T, government receipts
plus business reserves minus transfer payments minus inventory profits.
All these quantities are measured in current dollars. The deflator p is
the cost of living index, 1935-39 = 100. All the quantities are also
expressed as per capita figures. N is population in the continental United
States in billions of persons. The equations are as follows:

(3)  $\dfrac{C_t}{p_t N_t} = 84.74 + 0.58\,\dfrac{Y_t}{p_t N_t} + 0.15\,\dfrac{Y_{t-1}}{p_{t-1} N_{t-1}}$

(4)  $P_t = C_t + I_t + G_t$

(5)  $Y_t + T_t = P_t$


The first equation (3) is statistically estimated by an adaptation of the 
least squares method (section 7.2). The assumption is that there are 
errors in the first equation, but not in the variables (section 6.5). Equation 
(3) relates consumer spending to current and past income. All quantities 
are deflated and expressed per capita. The first equation (3) may be 
identified with the consumption function. The other two equations are 
definitions and hold exactly. It is possible to derive standard errors of 
the coefficients in the first equation. 

Klein used these equations for a forecast of the national income. He
assumed for 1947: $G_t = 32.8$, $T_t = -39.52 + 0.61 Y_t$, $Y_{t-1} = 138.7$,
$p_{t-1} = 1.30$, $N_{t-1} = 0.140$.

The forecast equation for Y (disposable income) is:

(6)  $Y = 70.32 + 27.51 p + 0.97 I$

By assuming various values for p (cost of living index) and for I (gross
capital formation), Klein obtains from (6) forecasts for Y (disposable
income) and for P (gross national product).


9 L. R. Klein: “The use of econometric models as a guide to economic
policy,” Econometrica, vol. 15 (1947), pp. 111 ff.

10 Ibid. See also Economic Fluctuations in the United States,
1921-41 (New York, 1950), pp. 80 ff.





Some of the forecasts taken from Klein's table are presented in Tables
1 and 2.


TABLE 1

Forecasts for Y (Disposable Income), 1947

p, Cost of              I, Gross Capital Formation
Living Index                30            40

    1.40                   138           148
    1.50                   141           150


TABLE 2

Forecasts for P (Gross National Product), 1947

p, Cost of              I, Gross Capital Formation
Living Index                30            40

    1.40                   183           198
    1.50                   187           203


These forecasts also have standard errors attached, which we have not
reproduced. The results can be exemplified in the following way: If
the fundamental assumptions of the simple model of Klein are fulfilled,
if the cost of living index p is 1.50, and if I, the gross capital formation,
is 40 billion current dollars, then we can make forecasts as follows: for Y
(disposable income) 150 billion current dollars and for P (gross national
product) 203 billion current dollars. The importance of such forecasts
for economic policy is evident.
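
The entries of Tables 1 and 2 can be checked by a short calculation. The
following is only a sketch in Python; the function names are ours and are
not taken from Klein. It evaluates the forecast equation (6) for Y and then
obtains P from the definition (5), P = Y + T, using the tax function assumed
for 1947, T = -39.52 + 0.61 Y. It does not reproduce the standard errors of
the forecasts.

```python
# Sketch only: reproduces the rounded entries of Tables 1 and 2 from the
# forecast equation (6) and the assumed tax function quoted in the text.

def disposable_income(p, capital_formation):
    """Equation (6): Y = 70.32 + 27.51 p + 0.97 I (billions of current dollars)."""
    return 70.32 + 27.51 * p + 0.97 * capital_formation


def gross_national_product(y):
    """Definition (5), P = Y + T, with the assumed T = -39.52 + 0.61 Y."""
    return y + (-39.52 + 0.61 * y)


def forecast_tables(prices=(1.40, 1.50), investments=(30, 40)):
    """Return two dicts keyed by (p, I): rounded forecasts of Y and of P."""
    table_y, table_p = {}, {}
    for p in prices:
        for inv in investments:
            y = disposable_income(p, inv)
            table_y[(p, inv)] = round(y)
            table_p[(p, inv)] = round(gross_national_product(y))
    return table_y, table_p


TABLE_Y, TABLE_P = forecast_tables()
for key in sorted(TABLE_Y):
    print(key, TABLE_Y[key], TABLE_P[key])
```

Because T is assumed to be a linear function of Y, each forecast of P is
simply 1.61 Y - 39.52, so the two tables carry the same information.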

The pioneering efforts of Klein and others deserve our admiration. 
But it is important to point out the limitations of his model and his 
statistical procedures: It is doubtful whether the relationships assumed 
are really linear. This assumption should certainly be tested. The 
neglect of errors of observations and similar phenomena impairs also to 
a certain extent the validity of the results. A method which permits one 
to deal with errors of observations will be presented in section 6.5. It is 
based upon the assumption of the existence of errors in the variables, but 
lack of errors in the equations. 

Klein’s data are time series. It is somewhat doubtful whether the 
mutual interdependence of consecutive observations has been sufficiently 
taken into account. Methods which deal with this difficult phenomenon
will be presented in Part 3 of this book.

Example 4. We feel that the “simple” theory of business fluctuations 11
has certain theoretical advantages. For purposes of policy it is less useful
than Klein's model.

Let us start with the complete static, i.e., timeless, Walrasian system of
general equilibrium. 12 There are n commodities and services in the
economy. We have one demand function and one supply function for
each commodity. All these demand and supply functions depend upon
all the prices in the economy. If $D_i$ is the demand function for
commodity i ($i = 1, 2, \dots, n$) and $S_i$ is its supply function, we have in
equilibrium:

(7)  $D_i(p_1, p_2, \dots, p_n) = S_i(p_1, p_2, \dots, p_n) \qquad (i = 1, 2, \dots, n)$

The quantity $p_i$ is the current price of commodity i. This (static)
Walrasian system cannot give rise to any kind of fluctuation.

Now we consider the following generalization: Instead of depending
upon present prices, the demand and supply functions depend upon
expected or anticipated prices. 13

Denoting the expected or anticipated price of commodity i by $\bar{p}_i$, our
system (7) becomes:

(8)  $D_i(\bar{p}_1, \bar{p}_2, \dots, \bar{p}_n) = S_i(\bar{p}_1, \bar{p}_2, \dots, \bar{p}_n) \qquad (i = 1, 2, \dots, n)$


11 G. Tintner: “A ‘simple’ theory of business fluctuations,” Econometrica,
vol. 10 (1942), pp. 317 ff.; “Une théorie ‘simple’ des fluctuations économiques,”
Revue d'économie politique, vol. 57 (1947), pp. 209 ff. B. Chait: Les fluctuations
économiques et l'interdépendance des marchés (Brussels, 1938). R. M. Goodwin:
“Dynamic coupling with especial reference to markets having production
lags,” Econometrica, vol. 15 (1947), pp. 181 ff. A. C. Pigou: Industrial
Fluctuations (2nd ed., London, 1920). A. H. Hansen: op. cit., pp. 362 ff.

12 G. J. Stigler: Production and Distribution Theories (New York, 1949),
pp. 228 ff.

13 G. Tintner: “The theoretical derivation of dynamic demand curves,”
Econometrica, vol. 6 (1938), pp. 375 ff.; “Elasticities of expenditure in the
dynamic theory of demand,” ibid., vol. 7 (1939), pp. 266 ff. H. Aujac: “Les
modèles mathématiques microdynamiques et le cycle,” Économie appliquée,
vol. 2 (1949), pp. 469 ff. R. F. Harrod: The Trade Cycle (Oxford, 1936);
Towards a Dynamic Economics (London, 1948). S. S. Alexander: “Mr.
Harrod's dynamic model,” Economic Journal, vol. 60 (1950), pp. 724 ff. J. R.
Hicks: A Contribution to the Theory of the Trade Cycle (Oxford, 1950). W. J.
Baumol: Economic Dynamics (New York, 1951). E. Altschul: “Die moderne
Konjunkturforschung in ihrer Beziehung zur theoretischen Nationaloekonomie,”
Schriften des Vereins für Sozialpolitik, vol. 173/2 (Munich, 1928), pp. 171 ff.


How are the anticipations regarding the prices formed? One simple
possible hypothesis is that anticipated prices depend upon present prices
and the rate of change of prices over time. 14 We have then:

(9)  $\bar{p}_i = \bar{p}_i(p_i, \dot{p}_i) \qquad (i = 1, 2, \dots, n)$

where $\dot{p}_i = dp_i/dt$ is the rate of change of $p_i$ over time. That is to say,
a rising price is expected to keep on rising, a falling price to keep on
falling. These assumptions may not be too bad as first approximations.
It is, however, likely that in reality the relationship is much more
complicated. 15

Making the assumption of linearity, which is certainly not strictly
satisfied, we are led to a system of linear differential equations with constant
coefficients:

(10)  $\sum_{j=1}^{n} A_{ij} p_j + \sum_{j=1}^{n} B_{ij} \dot{p}_j = 0$

($i = 1, 2, \dots, n$; $A_{ij}$ and $B_{ij}$ are constants.)

It is remarkable that such a system is capable of explaining at once the
trend and the business cycle. The nature of the roots $\lambda$ of the
determinantal equation (Appendix A.1):

(11)  $|A_{ij} + \lambda B_{ij}| = 0$

is decisive.

We may get purely exponential terms, which correspond to an increasing 
or decreasing trend proceeding in geometric progression. But we may 
also obtain sinusoidal fluctuations, which may have explosive, constant, 
or damped amplitudes. 16 
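
The connection between the system (10) and the determinantal equation (11)
can be sketched as follows, assuming a trial solution of exponential form.
The amplitudes $c_j$ and the symbols $\alpha$, $\beta$, $\varphi$ are notation
introduced here for illustration only.

```latex
% Trial solution substituted into the linear system (10)
p_j(t) = c_j e^{\lambda t}, \qquad \dot{p}_j(t) = \lambda c_j e^{\lambda t}
\quad\Longrightarrow\quad
\sum_{j=1}^{n} \left( A_{ij} + \lambda B_{ij} \right) c_j = 0
\qquad (i = 1, 2, \dots, n).

% A solution with not all c_j equal to zero exists only if
\left| A_{ij} + \lambda B_{ij} \right| = 0 .

% A real root \lambda contributes a term proportional to e^{\lambda t};
% a conjugate complex pair \lambda = \alpha \pm i\beta contributes
e^{\alpha t} \cos(\beta t + \varphi), \qquad \text{with period } 2\pi/\beta .
```

The fluctuation is explosive, of constant amplitude, or damped according
as $\alpha$ is positive, zero, or negative.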

The author has made an attempt at a tentative empirical verification 17 
with just three annual American price series. The data are taken for 
the period 1920-42. He used an index of stock prices, an index of farm 
prices, and an index of all other prices. 

The determinantal equation (11), the data for which are derived by the 
classical method of least squares, is here as follows: 



(12)  $\begin{vmatrix} -0.48 - \lambda & 2.16 & -1.29 \\ -0.24 & 0.45 - \lambda & -0.17 \\ -0.07 & 0.15 & -0.03 - \lambda \end{vmatrix} = 0$



14 G. C. Evans: Mathematical Introduction to Economics (New York, 1930),
pp. 122 ff.

15 G. Tintner: “A contribution to the non-static theory of choice,” Quarterly
Journal of Economics, vol. 56 (1942), pp. 302 ff.

16 F. R. Moulton: Differential Equations (New York, 1930), pp. 246 ff.

17 G. Tintner: “The ‘simple’ theory of business fluctuations: a tentative
verification,” Review of Economic Statistics, vol. 26 (1944), pp. 148 ff.





One root is real and gives rise to an exponential trend which corresponds 
to an increase of about 4 per cent each year. The other two roots are 
conjugate complex. They correspond to a sinusoidal fluctuation. The 
period of the fluctuations is about 13 years. The amplitudes of the 
fluctuations are damped. The damping ratio is about 1.05 each year. 
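
The figures just quoted can be checked numerically. The sketch below, in
Python with the standard library only, reads the coefficients off the
determinantal equation (12), forms the cubic characteristic equation, finds
its real root by bisection and the complex pair from the remaining quadratic.
The function names and the choice of bisection are ours; this is not the
computation used in the original study.

```python
import cmath
import math

# Coefficients read off the determinantal equation (12): |M - lambda I| = 0.
M = [[-0.48, 2.16, -1.29],
     [-0.24, 0.45, -0.17],
     [-0.07, 0.15, -0.03]]


def characteristic_coefficients(m):
    """Return (a2, a1, a0) with det(x I - m) = x^3 + a2 x^2 + a1 x + a0."""
    trace = m[0][0] + m[1][1] + m[2][2]
    minors = (m[0][0] * m[1][1] - m[0][1] * m[1][0]
              + m[0][0] * m[2][2] - m[0][2] * m[2][0]
              + m[1][1] * m[2][2] - m[1][2] * m[2][1])
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return -trace, minors, -det


def cubic_roots(a2, a1, a0):
    """One real root by bisection, the other two from the deflated quadratic."""
    def f(x):
        return ((x + a2) * x + a1) * x + a0

    bound = 1.0 + max(abs(a2), abs(a1), abs(a0))  # all roots lie inside this bound
    low, high = -bound, bound                      # f(low) < 0 < f(high)
    for _ in range(200):
        mid = 0.5 * (low + high)
        if f(mid) > 0:
            high = mid
        else:
            low = mid
    r = 0.5 * (low + high)
    b = a2 + r                                     # x^3 + ... = (x - r)(x^2 + b x + c)
    c = a1 + r * b
    disc = cmath.sqrt(b * b - 4.0 * c)
    return r, (-b + disc) / 2.0, (-b - disc) / 2.0


REAL_ROOT, Z1, Z2 = cubic_roots(*characteristic_coefficients(M))
GROWTH = math.exp(REAL_ROOT) - 1.0       # yearly growth rate of the trend
PERIOD = 2.0 * math.pi / abs(Z1.imag)    # length of one cycle in years
DAMPING = math.exp(-Z1.real)             # ratio of amplitudes one year apart

print(round(REAL_ROOT, 3), round(PERIOD, 1), round(DAMPING, 3))
```

The real root is close to 0.042, which corresponds to a growth of roughly
4 per cent each year; the complex pair has a negative real part, so the
fluctuation is damped.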

These results are too tentative to be taken seriously for practical appli¬ 
cations. The assumptions of linearity are not satisfied in the economy. 
There is no identification of the individual equations in the model. The 
use of the classical method of least squares is not likely to give satisfactory 
results. But it is believed that other investigations with more price series 
and also with improved statistical methods may conceivably help to 
illuminate some of the problems of business fluctuations. The problem 
of stochastic differential equations will be discussed in section 10.4. 

Dynamic relationships existing in the economy will be further illustrated
with the following examples: section 10.3.5, Examples 1 and 2; section
10.3.7, Example 1.



Chapter 4 



The Practical Importance of Econometrics 


In Chapter 3 we tried to indicate some of the possible uses of econo¬ 
metrics in economic policy. We will consider here very briefly the same 
problem in a more systematic form. The practical importance of econo¬ 
metrics will evidently depend on the state of economic organization and 
more specifically upon the extent of state interference. 

In a complete laissez faire economy 1 there is no state intervention in 
economic life. Hence the only use of econometrics would be in the field 
of private enterprise. Econometrics can evidently be helpful in the plan¬ 
ning of individual firms. Econometric methods may be used in market 
research to estimate and analyze the demand for the products of a given 
firm. Econometric analyses of production conditions may lead to a 
more thorough understanding of production costs and increase the 
efficiency of individual enterprises. 

If the complete laissez faire system is modified so that state intervention
is permissible in the field of monetary policy, 2 the scope of econometrics
becomes wider. Savings habits, liquidity preference, reaction to given
interest rates, etc., are all problems which at least in principle may be
analyzed with the help of econometric methods. In this way the government
may secure a firmer empirical basis for its monetary policy. Econometric
methods may also be used in order to compute indices of business
conditions. An econometric study of the labor market would give, for
example, some insight into the state of employment. These results could
be important for indications regarding the most favorable economic policy
which the government should follow.

Next, let us consider a capitalist economy based entirely (or very largely)
upon private ownership of the means of production but with extensive
state control and regulation of industry. 3 Here it will be essential that
the controlling state organs gain as complete information as possible about
the industries which are subject to regulation (e.g., public utilities and
large monopolies). An econometric study of the production and cost
conditions and also of demand conditions in these industries could
probably contribute a great deal to a more successful policy in this field.


1 L. von Mises: Socialism (London, 1936).

2 F. A. von Hayek: The Road to Serfdom (London, 1944), pp. 27 ff. H.
Simons: Economic Policy in a Free Society (Chicago, 1948). E. D. Allen and
O. H. Brownlee: Economics of Public Finance (New York, 1947).


Econometric models of the total economy will be needed if there is to
be rational planning for the use of “functional finance,” 4 in order to
prevent trade fluctuation. These models may also be helpful in planning
the government budget.

We assume next the existence of a mixed economy. 5 There are certain
industries which are socialized, whereas others are under private
management. In the public sector of the economy econometric methods may
be used again for demand and cost studies. These methods may conceivably
contribute something to the knowledge necessary for the planners
and enable them on the one hand to produce more efficiently, on the
other hand to produce the commodities which are actually in demand by
the public. This would be especially important if there is “competitive”
planning, so that the control of various enterprises or industries is to a
certain extent decentralized and they bid against each other on the markets.
If there is consumers' sovereignty, demand and family budget studies may
contribute to a more efficient and complete satisfaction of demand by the
various state enterprises.


3 A. H. Hansen: Fiscal Policy and the Business Cycle (New York, 1941).
E. S. Mason: “Price inflexibility,” Review of Economic Statistics, vol. 20 (1938),
pp. 53 ff. J. K. Galbraith: “Monopoly and the concentration of power,” in
H. S. Ellis, ed.: A Survey of Contemporary Economics (Philadelphia, 1948),
pp. 99 ff. A. H. Hansen: Monetary Policy and Fiscal Policy (New York, 1949).
T. Suranyi-Unger: Private Enterprise and Government Planning (New York,
1950). B. H. Higgins: Public Investment and Full Employment (Montreal,
1948); “Keynesian economics and public policy,” in S. E. Harris, ed.: The
New Economics (New York, 1948), pp. 468 ff. A. M. Henderson: “The pricing
of public utility undertakings,” Manchester School, vol. 15 (1947), pp. 223 ff.

4 A. P. Lerner: Economics of Control (New York, 1944).

5 O. Lange: “On the economic theory of socialism,” in B. Lippincott, ed.:
On the Economic Theory of Socialism (Minneapolis, Minn., 1938), pp. 90 ff.
See also A. Bergson, “Socialist economics,” in H. S. Ellis, ed.: A Survey of
Contemporary Economics (Philadelphia, 1948), pp. 412 ff. F. A. von Hayek:
“Socialist economics: the competitive solution,” Economica, new series, vol. 7
(1940), pp. 125 ff. V. Wyckoff: The Public Works Wage Rate and Some of
Its Economic Effects (New York, 1946). D. Rockefeller: Unused Resources
and Economic Waste (Chicago, 1941). J. E. Meade: Planning and the Price
Mechanism (London, 1948). T. W. Schultz: Agriculture in an Unstable
Economy (New York, 1945); Production and Welfare in Agriculture (New
York, 1949). D. J. Dewey: “Occupational choice in a collectivist economy,”
Journal of Political Economy, vol. 56 (1948), pp. 465 ff.


This type of economy will, however, have to be supplemented by a
certain amount of central planning. 6 Here econometric models for the
total economy could be very helpful in facilitating the coordination of
the various plans and also in carrying out the necessary fiscal and monetary
measures which supplement the plans of the various individual planning
authorities.

It is somewhat doubtful whether there ever was a case of completely
centralized planning, and whether it is really possible in a complex modern
economy to concentrate all decisions in one single body. But if we have
substantially centralized planning, 7 then it seems that econometric methods
gain in importance compared with their significance for other forms of
economic organization. If the dictator or the planning authority neglects
completely the wishes of the consumers, then there is of course no need
to make econometric studies of demand. In so far as consumption
affects the efficiency of the workers, however, some studies of this nature
would seem desirable, even if the wishes of the consumers are otherwise
neglected.

Centralized or almost centralized planning will necessitate a close
coordination of all production plans. It seems reasonable to expect that
this coordination may be helped by the type of information which can
be gained by carrying out econometric investigations into cost and
production conditions of various industries and enterprises. These may
then be integrated in some way into a central plan. 8 It seems likely that
in such an economy new economic problems of a yet unknown nature
will arise and that new econometric methods, possibly largely inspired
by accounting, 9 will have to be devised in order to deal with them
rationally.


6 G. Bienstock, S. M. Schwartz, A. Yugow, A. Feiler, J. Marschak: Management
in Russian Industry and Agriculture (New York, 1944). M. H. Dobb:
Soviet Economic Development since 1917 (New York, 1948).

7 O. Lange: op. cit. See also “Foundations of welfare economics,”
Econometrica, vol. 10 (1942), pp. 215 ff. P. M. Sweezy: Theory of Capitalist
Development (New York, 1942), pp. 52 ff. O. Lange: “The practice of economic
planning and the optimum allocation of resources,” Econometrica, vol. 17
(1949), supplement, pp. 166 ff. F. Perroux: “Les macrodécisions,” Économie
appliquée, vol. 2 (1949), pp. 321 ff.

8 F. A. von Hayek: Collectivist Economic Planning (London, 1947), pp.
208 ff. A. M. Henderson: “Prices and products in state enterprise,” Review
of Economic Studies, vol. 16 (1948), pp. 13 ff. E. Barone: “The ministry of
production in a collectivist state,” in F. A. von Hayek, ed.: Collectivist Economic
Planning (London, 1947), pp. 245 ff.

9 R. Stone and J. E. Meade: National Income and Expenditure (Cambridge,
1948). J. Marczewski and G. Th. Guilbaud: “Essai d'analyse graphique d'une
comptabilité nationale,” Économie appliquée, vol. 2 (1949), pp. 138 ff. J. B. D.
Derksen: A System of National Bookkeeping Illustrated by the Experience of the
Netherlands (Cambridge, 1946). J. Tinbergen and J. B. D. Derksen: “Recent
experiments in social accounting: flexible and dynamic budgets,” Econometrica,
vol. 17 (1949), supplement, pp. 195 ff. A. Smithies: “The effect of the role
of government on international comparisons of national income,” ibid., pp.
242 ff. J. Durbin: “Les equivalants a la somme de transactions,” Économie
appliquée, vol. 3 (1950), pp. 165 ff. F. Perroux: “Les nationalisations et la
comptabilité nationale,” ibid., vol. 2 (1949), pp. 59 ff. P. Boschan: National
Income: A Cross Section View (New York, 1950). R. Stone: The Role of
Measurement in Economics (Cambridge, 1951), pp. 38 ff.




Part 2 


Introduction to Multivariate 
Analysis 


This second, more technical part is intended to introduce the reader
or student to some of the more promising techniques which have been used 
in econometric research in recent years. A moderate knowledge of 
differential and integral calculus and the elements of matrix calculus and 
of modern statistics is assumed. Some ideas on matrix calculus and 
related subjects and on numerical methods are presented in the Appendix. 
The analysis of time series will be treated in Part 3. 

We start by discussing multiple regression and correlation, which 
provide an introduction to more complicated forms of multivariate 
analysis (Chapter 5). In Chapters 6 and 7 we present various forms of 
multivariate analysis which are of interest to the econometrician. 




Chapter 5 



Multiple Regression and Correlation 1 


5.1 Elements of Multiple Regression 

Assume that we want to predict the variable $X_1$ if specific values of the
variables $X_2, X_3, \dots, X_p$ are given. $X_1$ is the dependent variable, the
predicand. $X_2, \dots, X_p$ are the independent variables or predictors. We
assume a linear relationship:

(1)  $\hat{X}_1 = k_0 + k_2 X_2 + k_3 X_3 + \cdots + k_p X_p$

Sometimes we want also to get estimates for the regression coefficients
$k_2, k_3, \dots, k_p$ and the constant $k_0$, which in a hypothetical infinite
population determine the predicted value of $X_1$, $\hat{X}_1$, as a linear function
of the predictors $X_2, X_3, \dots, X_p$.

Assume that we have N observations on the values of $X_1, \dots, X_p$.
Denote a specific observation by $X_{it}$ ($i = 1, 2, \dots, p$; $t = 1, 2, \dots, N$).
Further, let $\bar{X}_i = \sum_t X_{it}/N$ ($i = 1, 2, \dots, p$) be the arithmetic mean of
$X_i$. Denote by $x_{it} = X_{it} - \bar{X}_i$ the deviation of $X_{it}$ from its arithmetic
mean. Further let:

(2)  $S_{ij} = \sum_{t=1}^{N} x_{it} x_{jt} \qquad (i, j = 1, 2, \dots, p)$

be the sums of squares and cross products. These may be computed by
the formula:

(3)  $S_{ij} = \sum_{t=1}^{N} X_{it} X_{jt} - N \bar{X}_i \bar{X}_j$
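
The computing formula for the sums of squares and cross products follows
from their definition in terms of deviations from the means. As a brief
check, written out in LaTeX and using only the definitions given above:

```latex
S_{ij} = \sum_{t=1}^{N} (X_{it} - \bar{X}_i)(X_{jt} - \bar{X}_j)
       = \sum_{t=1}^{N} X_{it} X_{jt}
         - \bar{X}_j \sum_{t=1}^{N} X_{it}
         - \bar{X}_i \sum_{t=1}^{N} X_{jt}
         + N \bar{X}_i \bar{X}_j
       = \sum_{t=1}^{N} X_{it} X_{jt} - N \bar{X}_i \bar{X}_j ,
\qquad \text{since } \sum_{t=1}^{N} X_{it} = N \bar{X}_i .
```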


The use of the method of least squares is indicated because it gives the
best, unbiased linear estimates of $X_1$ according to the Markoff theorem. 2


1 M. G. Kendall: The Advanced Theory of Statistics, vol. 1 (London, 1945),
pp. 301 ff. H. Cramér: Mathematical Methods of Statistics (Princeton, 1946).
S. S. Wilks: Mathematical Statistics (Princeton, 1943), pp. 157 ff.
M. Ezekiel: Methods of Correlation Analysis (New York, 1941). A. A.
Tschuprow: The Mathematical Theory of Correlation (London, 1925).

2 F. N. David and J. Neyman: “Extension of the Markoff theorem on least
squares,” Statistical Research Memoirs, vol. 2 (1938), pp. 105 ff. M. G.
Kendall: op. cit., vol. 2. F. N. David: Probability
Theory for Statistical Methods (Cambridge, 1949), pp. 161 ff.



The Markoff theorem holds under conditions which do not imply a 
normal distribution or even independence (section 10.5). 

The estimates must be linear; i.e., the parameters to be estimated (in
our case the values of $k_0, k_2, \dots, k_p$) must enter in a linear fashion. But
the observations need not enter as linear functions. The Markoff theorem
would still hold true if we substituted, for instance, $\log X_i$ or $\sin X_i$, etc.,
for $X_i$.

The linear estimates are unbiased in the sense that the mean values
(mathematical expectations) of the estimates are equal to the population
values. The mathematical expectation of a random variable x will be
denoted by Ex. Let us assume that we make many estimates of a parameter
$k_i$, and take the average of these estimates. Then the probability
that the mean of all the estimates will differ from the population value
of $k_i$ by less than a given number (which may be arbitrarily small) can be
made to come as close to 1 (certainty) as desired, just by increasing the
number of estimates which are being averaged.
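
This property can be illustrated by a small sampling experiment. The
sketch below is hypothetical throughout: the population values $k_0 = 5$ and
$k_2 = 2$, the error standard deviation, the fixed predictor values and the
seed are all chosen by us for illustration. It draws many samples from a
population with one predictor, estimates the slope by least squares in each,
and averages the estimates.

```python
import random


def slope_estimate(x, y):
    """Least squares slope of y on x, i.e. S_12 / S_22 in the notation of the text."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def average_of_estimates(replications, true_k0=5.0, true_k2=2.0,
                         sigma=1.0, seed=0):
    """Average the slope estimates over repeated samples (hypothetical population)."""
    rng = random.Random(seed)
    x = [float(t) for t in range(1, 11)]   # predictor values, held fixed
    total = 0.0
    for _ in range(replications):
        # Only the dependent variable is subject to (normal) errors.
        y = [true_k0 + true_k2 * a + rng.gauss(0.0, sigma) for a in x]
        total += slope_estimate(x, y)
    return total / replications


for reps in (10, 100, 2000):
    print(reps, average_of_estimates(reps))
```

As the number of averaged estimates grows, the average settles ever more
closely around the population value 2, although any single estimate may
differ from it appreciably.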

By best linear estimates we mean the following: The least squares esti¬ 
mates have the smallest variance or standard error among all linear 
unbiased estimates. 

If the condition of independence of the residuals is not fulfilled there 
arise certain complications. The method of least squares has to be 
modified, as Aitken 3 has shown. We will deal with some of these ideas 
in section 10.5. 

The sum of squares to be minimized is:

(4)  $Q = \sum_{t=1}^{N} (X_{1t} - \hat{X}_{1t})^2 = \sum_{t=1}^{N} (X_{1t} - k_0 - k_2 X_{2t} - k_3 X_{3t} - \cdots - k_p X_{pt})^2$

Minimizing this expression (4) leads to the following set of linear
equations, which are called normal equations:

(5)  $S_{22} k_2 + S_{23} k_3 + \cdots + S_{2p} k_p = S_{12}$
     $S_{32} k_2 + S_{33} k_3 + \cdots + S_{3p} k_p = S_{13}$
     $\cdots$
     $S_{p2} k_2 + S_{p3} k_3 + \cdots + S_{pp} k_p = S_{1p}$

We derive from this set of equations the estimates of the regression
coefficients $k_2, \dots, k_p$.


3 A. C. Aitken: “On least squares and linear combination of observations,” 
Proceedings of the Royal Society of Edinburgh, vol. 55 (1935), pp. 42 ff. 






Let $c_{ij}$ denote the elements of the inverse of the matrix (Appendix A.1) used
in the normal equations (5). Then $k_i$ can also be computed by the
formula:

(6)  $k_i = c_{i2} S_{12} + c_{i3} S_{13} + \cdots + c_{ip} S_{1p}$

The constant $k_0$ is computed by the equation:

(7)  $\bar{X}_1 = k_0 + k_2 \bar{X}_2 + k_3 \bar{X}_3 + \cdots + k_p \bar{X}_p$

It is determined by the condition that the best fit of $X_1$ must go through
the means of all the variables. This result also follows from the method
of least squares.
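
The whole computation can be sketched in a few lines of Python, using the
standard library only. The data below are hypothetical and were constructed
by us so that $X_1 = 1 + 2 X_2 + 3 X_3$ holds exactly; the function names and
the Gauss-Jordan inversion are likewise our own choices, intended only to
show how the sums of squares and cross products, the inverse elements
$c_{ij}$, the regression coefficients and the constant fit together.

```python
def moment_matrix(columns):
    """Means and S[i][j] = sum_t X_it X_jt - N * mean_i * mean_j."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    s = [[sum(a * b for a, b in zip(ci, cj)) - n * mi * mj
          for cj, mj in zip(columns, means)]
         for ci, mi in zip(columns, means)]
    return means, s


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    size = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def fit(columns):
    """columns[0] is the predicand X_1; columns[1:] are the predictors X_2 ... X_p."""
    means, s = moment_matrix(columns)
    p = len(columns)
    c = invert([row[1:] for row in s[1:]])   # inverse of the normal-equation matrix
    # k_i = c_i2 S_12 + c_i3 S_13 + ... + c_ip S_1p
    slopes = [sum(c[i][j] * s[0][j + 1] for j in range(p - 1))
              for i in range(p - 1)]
    # The fitted relation passes through the means of all the variables.
    k0 = means[0] - sum(k * m for k, m in zip(slopes, means[1:]))
    return k0, slopes, c, s


# Hypothetical data: X1 = 1 + 2 X2 + 3 X3 exactly, so the fit is perfect.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
X1 = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(X2, X3)]

K0, SLOPES, C, S = fit([X1, X2, X3])
print(K0, SLOPES)
```

With real data the relation will of course not hold exactly, and the
residual sum of squares will be positive.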

It should be emphasized that the method of multiple regression is
designed for the purpose of predicting $X_1$ if $X_2, \dots, X_p$ are given. It is
not, in general, appropriate for the estimation of the values of the constant
$k_0$ and the regression coefficients $k_2, \dots, k_p$ as they exist in the hypothetical
infinite population corresponding to the sample.

There are special situations in which the method indicated above yields
at once the best prediction of $X_1$ and also the best estimates of the constant
and the regression coefficients. This is the case if we have a population
in which only $X_1$ is a random variable (subject to errors) and the
independent variables $X_2, \dots, X_p$ are constants (not subject to error). This
is a situation in which we have errors in the variable $X_1$. We may, for
instance, have errors of observations which affect this variable but not the
others. Errors in the variables will be discussed in section 6.5.

Another case in which the method indicated above yields valid estimates
of the population regression coefficients is the following: Assume that
apart from the variables $X_2, \dots, X_p$ which appear in the regression
equation, there are other variables $X_{p+1}, X_{p+2}, \dots$ which exert an
influence on $X_1$ but which are not included in the regression equation.
The reason for this omission may be that we have no data pertaining to
these variables, that their existence is not even suspected, that there are
too many and the influence of a single one is negligible, etc.

The situation discussed above gives rise to errors in the regression
equation, regarded as a single equation. Under certain assumptions
regarding these errors the method of least squares applied to the single
equation will yield the best estimates of the regression coefficients
in the population. Errors in the equations will be treated in
Chapter 7.

But this method is no longer valid if our equation belongs to a whole
system of equations, all of which contain errors. This case (problem of
identification) will be discussed in section 6.5 and Chapter 7.






5.2 Distributions 


All the subjects treated above belong to the class of point estimation.
If we want to form an idea about the success of our linear fit, if we want
interval estimates, if we want to test hypotheses or make tests of
significance, we have to make more definite assumptions about the distribution
of the deviations or errors: $X_{1t} - \hat{X}_{1t}$.

We proceed from the same fundamental data as in the previous section
5.1 on multiple regression. Assume now that we have N values of the
dependent variable $X_1$, say, $X_{1t}$ ($t = 1, 2, \dots, N$), which are samples
from a normal population. Assume again that we have in the population
which corresponds to our sample a linear relationship between $X_1$ and
the independent variables or predictors $X_2, \dots, X_p$:

(1)  $\hat{X}_{1t} = k_0 + k_2 X_{2t} + k_3 X_{3t} + \cdots + k_p X_{pt} \qquad (t = 1, 2, \dots, N)$

Let us apply the method of maximum likelihood. That is to say, we will
choose the constant $k_0$ and the regression coefficients $k_2, \dots, k_p$ in such
a fashion as to make the probability density of the particular sample
$X_{11}, \dots, X_{1N}$ a maximum. 1


The method of maximum likelihood has certain optimum properties 
which recommend its use to the statistician. Maximum likelihood esti¬ 
mates are consistent . 2 That is to say, the estimate converges in probability 
to the true or population value, as the sample becomes larger and larger. 

By convergence in probability 3 the following is meant: It is almost 
certain (the probability is as near to 1 as desired) that the difference 
between the estimate and the population value will be as small as we like, 
if the size of the sample increases indefinitely. It is very unlikely that 
in large samples there should be any large difference between the estimate 
and the population value. 

For large samples the maximum likelihood estimates also tend under
certain conditions to become normally distributed. They are efficient 4
in the sense that they have in the limit (for large samples) a smaller
variance or standard error than any other estimate. They also are
sufficient 5 if a sufficient estimate exists. That is to say, they exhaust all
the information regarding the population value available in the sample.


1 H. Cramér: Mathematical Methods of Statistics (Princeton, 1946), pp.
498 ff. S. S. Wilks: Mathematical Statistics (Princeton, 1943), pp. 136 ff.
R. A. Fisher: “On the mathematical foundations of theoretical statistics,”
Transactions of the Royal Society in London, series A, vol. 222 (1922), pp. 309 ff.,
Contributions to Mathematical Statistics (New York, 1950), Paper 10. O.
Anderson: Einfuehrung in die mathematische Statistik (Vienna, 1935), pp. 262 ff.

2 H. Cramér: op. cit., pp. 489 ff. S. S. Wilks: op. cit., pp. 33 ff.

3 H. Cramér: op. cit., pp. 253 ff. S. S. Wilks: op. cit., pp. 81 ff.

4 H. Cramér: op. cit., pp. 487 ff. S. S. Wilks: op. cit., pp. 134 ff.


It can be shown that the method of maximum likelihood leads again
to the method of least squares. The probability density of a specific
value of $X_{1t}$ is:

(2)  $p_t = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\dfrac{(X_{1t} - k_0 - k_2 X_{2t} - \cdots - k_p X_{pt})^2}{2\sigma^2} \right] \qquad (t = 1, 2, \dots, N)$

where $\sigma^2$ is the population variance of each of the $X_{1t}$. The values
$X_{11}, X_{12}, \dots, X_{1N}$ are supposed to be independent. Hence the probability
of specific values of $X_{11}, X_{12}, \dots, X_{1N}$ appearing together is simply the
product of all the $p_t$:

(3)  $P = p_1 p_2 \cdots p_N = \dfrac{1}{\sigma^N (2\pi)^{N/2}} \exp\left[ -\dfrac{Q}{2\sigma^2} \right]$

where Q has the same value as in formula (4) of 5.1. The values of the
constants $k_0, k_2, \dots, k_p$ which are to be estimated appear only in Q.
Evidently P will be the larger the smaller Q is. Hence for a maximum
of P we have to find the minimum of Q. The method of maximum
likelihood leads to the method of least squares.
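
The same conclusion can be read off the logarithm of P. The following short
derivation is a sketch in LaTeX; it assumes only the normal density given
above, and the index $j$ is introduced here for convenience.

```latex
\log P = -N \log \sigma - \frac{N}{2} \log (2\pi) - \frac{Q}{2\sigma^2},
\qquad
\frac{\partial \log P}{\partial k_j}
  = -\frac{1}{2\sigma^2} \, \frac{\partial Q}{\partial k_j} = 0
\qquad (j = 0, 2, 3, \dots, p).
```

Since $\sigma^2$ is positive, these conditions are the same as
$\partial Q / \partial k_j = 0$, which are the conditions for a minimum of the
sum of squares Q.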

The maximum likelihood estimates of the constant $k_0$ and of the
regression coefficients $k_2, \dots, k_p$ are determined from the normal equations
as indicated in the previous section. The normal equations are given
in formula (5) of section 5.1.

In order to test the significance of the linear relationship we compute
the multiple correlation coefficient. 6 This is the simple correlation
coefficient between $X_1$ and $\hat{X}_1$:

(4)  $R^2_{1.23 \cdots p} = \dfrac{k_2 S_{12} + k_3 S_{13} + \cdots + k_p S_{1p}}{S_{11}}$


To test the multiple correlation coefficient we have to choose a level
of significance, say 5 per cent or 1 per cent. The total variance of $X_1$
can be divided into two independent sums of squares:

(5)  $S_{11} = \sum_{t=1}^{N} (X_{1t} - \hat{X}_{1t})^2 + k_2 S_{12} + k_3 S_{13} + \cdots + k_p S_{1p}$


5 H. Cramér: op. cit., pp. 488 ff. S. S. Wilks: op. cit., pp. 135 ff.

6 M. Fréchet: “Anciens et nouveaux indices de corrélation,” Econometrica,
vol. 15 (1947), pp. 410 ff.




The first sum in (5) is the sum of squares of the deviations from the
regression function. The second is the sum of squares of the regression function
itself. The first is distributed like $\chi^2$ with $N - p$ degrees of freedom.
The second is distributed independently of the first like $\chi^2$ with $p - 1$
degrees of freedom.

Hence the ratio

(6)  $F = \dfrac{k_2 S_{12} + k_3 S_{13} + \cdots + k_p S_{1p}}{S_{11} - k_2 S_{12} - k_3 S_{13} - \cdots - k_p S_{1p}} \cdot \dfrac{N - p}{p - 1} = \dfrac{R^2 (N - p)}{(1 - R^2)(p - 1)}$

is distributed like Snedecor's F with $p - 1$ and $N - p$ degrees of freedom.
It should be remembered that p is the total number of variables, both
dependent and independent.

We test the null hypothesis that in the population the multiple corre¬ 
lation coefficient is zero. If the null hypothesis is rejected, from the 
point of view of the level of significance chosen in advance, we may 
consider the multiple correlation as significant. In this case there is 
probably a linear relationship between the variables in the population 
which corresponds to the sample. Tables which provide for a direct 
test of the multiple correlation coefficient at various levels of significance 
are given by Snedecor. 7 
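
The arithmetic of this test is brief. The sketch below computes the squared
multiple correlation coefficient of formula (4) and the ratio F of formula
(6) from the total sum of squares $S_{11}$ and the sum of squares due to the
regression. The function names and the numerical values are hypothetical and
serve only as an illustration; the resulting F must still be compared with a
tabulated value for $p - 1$ and $N - p$ degrees of freedom.

```python
def multiple_r_squared(s11, explained):
    """Formula (4): R^2 = (k_2 S_12 + ... + k_p S_1p) / S_11."""
    return explained / s11


def f_ratio(s11, explained, n_obs, p):
    """Formula (6), first form; p counts all variables, dependent and independent."""
    residual = s11 - explained            # sum of squares of the deviations
    return (explained / residual) * (n_obs - p) / (p - 1)


def f_ratio_from_r2(r2, n_obs, p):
    """Formula (6), second form, in terms of R^2."""
    return r2 * (n_obs - p) / ((1.0 - r2) * (p - 1))


# Hypothetical figures: N = 23 observations on p = 3 variables.
S11 = 100.0          # total sum of squares of X_1
EXPLAINED = 80.0     # k_2 S_12 + k_3 S_13
R2 = multiple_r_squared(S11, EXPLAINED)
print(R2, f_ratio(S11, EXPLAINED, 23, 3), f_ratio_from_r2(R2, 23, 3))
```

Both forms give the same value, here with 2 and 20 degrees of freedom.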

Another test is provided for the individual regression coefficients
$k_2, k_3, \dots, k_p$. Assume that we want to test the hypothesis that the
population value of $k_i = \bar{k}_i$ ($i = 2, 3, \dots, p$).

We form the quantity:

(7)  $s^2 = \dfrac{S_{11} - k_2 S_{12} - k_3 S_{13} - \cdots - k_p S_{1p}}{N - p}$

which is the sum of squares of the deviations of the observations from
the multiple regression equation estimates divided by their appropriate
number of degrees of freedom. Then the test function:

(8)  $t = \dfrac{k_i - \bar{k}_i}{s \sqrt{c_{ii}}}$

is distributed like Student's t with $N - p$ degrees of freedom; $c_{ii}$ is the
diagonal element of the inverse matrix which was defined above in section
5.1. Having chosen in advance a level of significance, say 1 per cent or
5 per cent, we can test the hypothesis that in the population $k_i = \bar{k}_i$.

More specifically, we may put k t = 0. Then we are testing the null 
hypothesis that Ar, = 0 in the population corresponding to our sample. 


7 G. W. Snedecor: Statistical Methods (4th ed., Ames, Iowa, 1946), pp. 351 ff. 





This is the same as the hypothesis that in the population there is no (linear)
relationship between $X_1$ and $X_i$. If the null hypothesis is rejected we
will consider $k_i$ significant (i.e., significantly different from zero) on the
basis of the level of significance adopted.
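
A minimal sketch of this test in Python follows. It assumes that the inverse matrix of section 5.1 is the inverse of the matrix of sums of squares and products of the independent variables; the function name and the small data set are hypothetical and serve only as an illustration.

```python
import numpy as np


def coefficient_t_values(x1, x_indep, kappa=None):
    """Return (k, t, df) for the regression coefficients k_2 ... k_p."""
    x1 = np.asarray(x1, dtype=float)
    x_indep = np.asarray(x_indep, dtype=float)    # shape (N, p - 1)
    n_obs, n_indep = x_indep.shape
    p = n_indep + 1                               # total number of variables
    dev1 = x1 - x1.mean()                         # deviations of X_1
    dev = x_indep - x_indep.mean(axis=0)          # deviations of X_2 ... X_p
    s_matrix = dev.T @ dev                        # S_ij for i, j = 2 ... p
    s_1 = dev.T @ dev1                            # S_12 ... S_1p
    s_11 = dev1 @ dev1
    k = np.linalg.solve(s_matrix, s_1)            # normal equations
    s2 = (s_11 - k @ s_1) / (n_obs - p)           # residual variance estimate
    c_diag = np.diag(np.linalg.inv(s_matrix))     # assumed meaning of c_ii
    if kappa is None:
        kappa = np.zeros(n_indep)                 # null hypothesis k_i = 0
    t = (k - np.asarray(kappa, dtype=float)) / np.sqrt(s2 * c_diag)
    return k, t, n_obs - p


# Hypothetical observations on one dependent and two independent variables.
x_indep = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
x1 = [3.1, 3.9, 7.2, 7.8, 11.1, 10.9]
k, t, df = coefficient_t_values(x1, x_indep)
print(k, t, df)
```

Each element of `t` would be compared with the tabulated value of Student's t for `df` degrees of freedom at the level of significance chosen in advance.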

A similar test is provided for the constant $k_0$. We may test the
hypothesis that in the population $k_0$ is equal to a given number $\kappa_0$.
Then we form the quantity:

(9)    $t = \frac{(k_0 - \kappa_0) \sqrt{N}}{s}$

This is again distributed like Student’s $t$ with $N - p$ degrees of freedom.
Given a level of significance, we can test the hypothesis that $k_0 = \kappa_0$.
We can also test the null hypothesis that $k_0 = 0$.

We may also regard the independent variables or predictors $X_2, X_3,
\cdots, X_p$ as normally distributed. We are then led to the consideration of
a multivariate normal distribution. The Wishart distribution 8 deals with
the joint distribution of the sample variances and covariances $a_{ij}$ derived
from such a normal multivariate distribution. Here $a_{ij} = S_{ij}/(N - 1)$.
We divide by $N - 1$ because we want our sample estimates of the variances
and covariances $a_{ij}$ to be unbiased estimates of the population variances
and covariances. This distribution is fundamental for all distribution
theory in multivariate analysis.


5.3 A Test for Linear Relations 


Another test is sometimes required in econometric investigations for 

the following hypothesis: It is assumed that the weighted sum of the 

regression coefficients in the population is a given number. 

We follow then a procedure given by Wilks. 1 We compare the sum 

of squares of the deviations from the regression equation fitted by the 

method of least squares without the restriction with the sum of squares 

of the deviations from the other regression equation fitted with the 
restriction. 

Denote the new regression coefficients which are to be fitted under the
linear restriction by $K_0, K_2, K_3, \cdots, K_p$. Then we have to minimize the
sum of squares of deviations,

(1)    $Q_2 = \sum_{t=1}^{N} (X_{1t} - K_0 - K_2 x_{2t} - K_3 x_{3t} - \cdots - K_p x_{pt})^2$



8 J. Wishart: “The generalized product distribution from samples from a
normal multivariate population,” Biometrika, vol. 20A (1928), pp. 32 ff.

1 S. S. Wilks: Mathematical Statistics (Princeton, 1945), pp. 124 ff. J. O.
Irwin: “Mathematical theorems involved in the analysis of variance,” Journal
of the Royal Statistical Society, vol. 94 (1931), pp. 284 ff.





under the condition that the new regression coefficients fulfil the following
linear relationship:

(2)    $a_2 K_2 + a_3 K_3 + \cdots + a_p K_p = m$

The numbers $a_2, a_3, \cdots, a_p$ are the constant weights and are assumed to
be given. The constant $m$ is also given.

It follows from the theory of restricted maxima and minima that the
constrained minimum of $Q_2$ is the same as the minimum of the function

(3)    $F = Q_2 + 2\lambda (a_2 K_2 + a_3 K_3 + \cdots + a_p K_p - m)$

where $\lambda$ is a constant, the so-called Lagrange multiplier.

The “normal equations” necessary to determine $K_2, \cdots, K_p$ and $\lambda$ are
now:

(4)
$K_2 S_{22} + K_3 S_{23} + \cdots + K_p S_{2p} + \lambda a_2 = S_{12}$
$K_2 S_{23} + K_3 S_{33} + \cdots + K_p S_{3p} + \lambda a_3 = S_{13}$
$\cdots$
$K_2 S_{2p} + K_3 S_{3p} + \cdots + K_p S_{pp} + \lambda a_p = S_{1p}$
$K_2 a_2 + K_3 a_3 + \cdots + K_p a_p = m$


This set of equations is very similar to the ordinary normal equations of 
the classical theory [formula (5) of section 5.1] and may be solved by 
similar methods. 
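
One way to solve the system (4) numerically is to border the matrix of sums of squares and products with the weights and solve a single linear system. The sketch below assumes this arrangement; the function name `restricted_fit` and the small numerical example are hypothetical. The restricted sum of squares is then obtained as $S_{11} - K_2 S_{12} - \cdots - K_p S_{1p} - m\lambda$.

```python
import numpy as np


def restricted_fit(s_matrix, s_1, s_11, a, m):
    """Solve the restricted normal equations; return (K, lam, Q2)."""
    s_matrix = np.asarray(s_matrix, dtype=float)   # S_ij, i, j = 2 ... p
    s_1 = np.asarray(s_1, dtype=float)             # S_12 ... S_1p
    a = np.asarray(a, dtype=float)                 # weights a_2 ... a_p
    q = len(a)
    bordered = np.zeros((q + 1, q + 1))
    bordered[:q, :q] = s_matrix
    bordered[:q, q] = a                            # column multiplying lambda
    bordered[q, :q] = a                            # the restriction itself
    rhs = np.append(s_1, m)
    solution = np.linalg.solve(bordered, rhs)
    K, lam = solution[:q], solution[q]
    q2 = s_11 - K @ s_1 - m * lam                  # restricted residual sum of squares
    return K, lam, q2


# Hypothetical sums of squares and products for two independent variables,
# with the restriction K_2 + K_3 = 1.
S = [[4.0, 1.0], [1.0, 3.0]]
S1 = [2.0, 1.0]
S11 = 5.0
K, lam, Q2 = restricted_fit(S, S1, S11, a=[1.0, 1.0], m=1.0)
print(K, lam, Q2)
```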

The sum of squares of the residuals becomes now:

(5)    $Q_2 = S_{11} - K_2 S_{12} - K_3 S_{13} - \cdots - K_p S_{1p} - m\lambda$

This is distributed like $\chi^2$ with $N - p + 1$ degrees of freedom.

It will be recalled that the sum of the squares of the residuals for the
regression equation fitted without the restriction was

(6)    $Q_1 = S_{11} - k_2 S_{12} - k_3 S_{13} - \cdots - k_p S_{1p}$

where the $k_i$ are determined from the normal equations given above,
formula (5) of section 5.1. The quantity $Q_1$, it will be recalled, is
distributed like $\chi^2$ with $N - p$ degrees of freedom. It can also be shown with
the help of Cochran's theorem 2 that the two sums of squares $Q_1$ and $Q_2$
are independent.

We want to test the hypothesis that in the population $Q_1 = Q_2$. That
is to say, the hypothesis is that in the population there is actually a linear


2 W. G. Cochran: “The distribution of quadratic forms in a normal system, 
with applications to the analysis of covariance,” Proceedings of the Cambridge 
Philosophical Society , vol. 30 (1934), pp. 178 ff. 






relationship between the regression coefficients, such that their weighted 
sum is equal to m. 

We form the test function:

(7)    $F = \frac{(Q_2 - Q_1)(N - p)}{Q_1}$

This is distributed like Snedecor's $F$ with 1 and $N - p$ degrees of freedom.
We judge the validity of the hypothesis again from the point of view of a 
given level of significance which ought to be chosen in advance. 

Example 1. A production function 3 which is linear in the logarithms
has been fitted for 609 Iowa farms for 1942. (See section 3.4, Example 2.)
The dependent variable was the logarithm of the product. The independent
variables were the logarithms of the following factors: land,
labor, improvements, liquid assets, working assets, cash operating
expenses. The multiple correlation coefficient was 0.821. It is significant
at the 1 per cent level of significance. The sum of the squares of the
deviations from the fitted regression line was $Q_1 = 7.973$.

To test the hypothesis that the production function is a linear homogeneous
function, i.e., a homogeneous function of degree 1 in the factors of
production, we impose the condition that the sum of the regression
coefficients is equal to 1. The sum of the squares of the deviations from the
function fitted under this restriction was $Q_2 = 8.044$. We test the null
hypothesis at the level of significance of 1 per cent.

From the above formula (7) we have $F = 5.3625$, which is distributed
with 1 and 602 degrees of freedom. The number of degrees of freedom
was not given correctly in the paper quoted above.

At the 1 per cent level of significance the permissible value of $F$ is 6.64.
Hence the empirical F is not significant. The hypothesis that there is 
a linear homogeneous production function need not be rejected. This 
result is of far-reaching importance in economics. It means that there 
are probably no economies or dis-economies of large-scale production. 

It should be remembered, however, that one important factor of 
production was not included in our production function. This very 
scarce factor is management or entrepreneurship. It seems likely that 
this factor determines ultimately the size of the enterprise. 
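
The arithmetic of Example 1 can be retraced with a short Python sketch. The function name `restriction_f` is an assumption made for illustration; the figures are those quoted in the example, with $p = 7$ (one dependent and six independent variables). Because the sums of squares are quoted to three decimals only, the result agrees with the printed value of $F$ to about two decimals.

```python
def restriction_f(q1: float, q2: float, n_obs: int, n_vars: int):
    """Variance ratio for a linear restriction, with 1 and N - p degrees of freedom."""
    df2 = n_obs - n_vars
    f_value = (q2 - q1) * df2 / q1
    return f_value, 1, df2


# Figures from Example 1: 609 farms, 7 variables, Q_1 = 7.973, Q_2 = 8.044.
f_value, df1, df2 = restriction_f(7.973, 8.044, 609, 7)
critical_1_per_cent = 6.64    # tabulated value quoted in the text
print(f_value, df1, df2, f_value > critical_1_per_cent)
```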


5.4 Partial Correlations 

Suppose that we want to measure the relationship between $X_1$ and $X_2$
after the influence of another variable, say $X_3$, has been eliminated from
both the variables $X_1$ and $X_2$.


3 G. Tintner: “A note on the derivation of production functions from farm
records,” Econometrica, vol. 12 (1944), pp. 26 ff.





Denote by $r_{ij}$ the simple correlation between $X_i$ and $X_j$. The partial
correlation coefficient becomes

$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2} \sqrt{1 - r_{23}^2}}$

To test a partial correlation coefficient for its significance we use the
distribution of the simple correlation coefficient. But the $t$-table has to
be entered for $N - 2 - q$ degrees of freedom, where $q$ is the number of
variables held constant. In our case, just one variable ($X_3$) is held
constant. Hence $q = 1$.
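
As a small illustration, the partial correlation coefficient and the degrees of freedom for its test may be computed as follows. The function names and the three simple correlations are hypothetical and are not taken from the text.

```python
import math


def partial_correlation(r12: float, r13: float, r23: float) -> float:
    """Correlation between X_1 and X_2 with the influence of X_3 eliminated."""
    return (r12 - r13 * r23) / math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))


def partial_correlation_df(n_obs: int, q: int) -> int:
    """Degrees of freedom N - 2 - q, where q variables are held constant."""
    return n_obs - 2 - q


# Hypothetical simple correlations from a sample of N = 30 observations.
r_partial = partial_correlation(0.5, 0.6, 0.8)
df = partial_correlation_df(30, 1)
print(r_partial, df)
```

In this made-up case a simple correlation of 0.5 shrinks to a very small partial correlation, because most of the association between the first two variables runs through the third.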



Chapter 6 


Some Applications of Multivariate Analysis 
to Economic Data 


This chapter proposes to introduce the economic statistician to some of 
the newer methods of multivariate analysis. 1 The emphasis will be on 
methods of estimation and not on tests of hypotheses. Tests of signifi¬ 
cance will be indicated where they have been established. 

Estimation by means of multivariate analysis presents certain general¬ 
izations of the methods of multiple regression. 2 These analogies will 
be emphasized later in the course of the discussion of the various pro¬ 
cedures. The formal relationships between various types of multivariate 
analysis have been treated elsewhere. 3 We will here deal only with the 
following methods: 

(1) Discriminant analysis (section 6.2). We propose to determine 
linear functions or “indices” computed from various measurable charac¬ 
teristics of certain data. The data have been classified into two groups. 
Discriminant analysis tries to establish linear functions of the character¬ 
istics which are such that they distinguish most successfully in a certain 
sense between these groups. This method was invented by R. A. Fisher.
A test of significance utilizes earlier work of Harold Hotelling. 


1 A summary of some of the methods is given in the following: S. S. Wilks:
Mathematical Statistics (Princeton, 1943), pp. 252 ff. M. G. Kendall: The
Advanced Theory of Statistics, vol. 2 (London, 1946), pp. 328 ff. M. S. Bartlett:
“Multivariate analysis,” Journal of the Royal Statistical Society, supplement,
vol. 9 (1947), pp. 175 ff.; “Internal and external factor analysis,” British Journal
of Psychology, statistical section, vol. 1 (1949), pp. 73 ff. G. Tintner: “Some
applications of multivariate analysis to economic data,” Journal of the American
Statistical Association, vol. 41 (1946), pp. 572 ff. W. G. Madow: “Contributions
to the theory of multivariate statistical analysis,” Transactions of the
American Mathematical Society, vol. 44 (1938), pp. 454 ff.

2 M. Ezekiel: Methods of Correlation Analysis (2nd ed., New York, 1941).

3 G. Tintner: “Some formal relations in multivariate analysis,” Journal of
the Royal Statistical Society, series B, vol. 12 (1950), pp. 95 ff.






(2) Principal components (section 6.3). Here we try to answer the 
following question: Is it possible to analyze a set of variables into a 
more fundamental set of components (“factors”), possibly fewer in
number? Which portion of the total variance can be accounted for by 
each component? The best method in this field is due to Harold 
Hotelling. 

(3) Canonical correlation (section 6.4). Assume that we have two 
sets of variables. How can we determine linear combinations (“indices”) 
of the variables in each set in such a fashion that the correlation between 
the indices becomes a maximum? This method is due to Harold 
Hotelling. 

(4) Weighted regression (section 6.5). Assume that we have a set of 
variables all of which are subject to disturbances (“errors”). How can 
we find weighted linear regression functions which will give us in a certain 
sense the “best” estimates of the weighted regression coefficients? This
method, evidently closely related to classical multiple regression analysis, 
is in its present form due to Tjalling C. Koopmans. It can also be used to 
answer a question previously raised by Ragnar Frisch: How many linear 
relationships probably exist between the systematic parts of the variables 
in the population which corresponds to the sample (multicollinearity)? 

In what follows we propose to discuss these methods briefly and with 
a uniform notation. We will try to avoid presentation of numerical 
methods. These will be presented in Appendix A.2. No effort has 
been made to give a complete survey of the literature. Some related 
problems will be treated in Chapter 7. 

Some examples previously given by other authors will be summarized, 
and new examples will also be presented. It should be remembered that 
these examples are only tentative applications of the various methods 
and should be regarded merely as illustrations. It is to be hoped that 
they will stimulate more extensive applications in the economic field. 

The data which we use in our examples are time series. But we have 
neglected almost entirely this particularity of the data and the difficulties 
connected with it. 4 Possibly this introduces some biases into the tests 
of significance because of the autocorrelations 5 probably existing in the 
data (section 10.1). The problem of degrees of freedom in economic 


4 See G. Tintner: “The analysis of economic time series,” Journal of the
American Statistical Association, vol. 35 (1940), pp. 93 ff.

5 R. L. Anderson: “Distribution of the serial correlation coefficient,” Annals
of Mathematical Statistics, vol. 13 (1942), pp. 1 ff.





time series has been treated by H. T. Davis. 6 No use has been made of 
these and similar methods. Presence of autocorrelation makes the esti¬ 
mates inefficient. They remain, however, consistent estimates. The loss 
of efficiency is not very considerable if the autocorrelation is not too large. 
It should be remembered, however, that the tests of significance, where 
they are given, may be influenced by existing autocorrelation in the vari¬ 
ables. Some problems connected with this difficulty will be discussed in 
Part 3. 

Another obvious shortcoming of the methods presented below is the 
fact that they all assume essentially linear relationships existing in the 
population corresponding to the sample. It is to be hoped that this 
difficulty can be overcome later and that analogous methods will be 
developed to deal with non-linear cases. We may, for instance, use 
squares, cross products, and higher powers of the variables. 


6.1 Notation 

Throughout this chapter we will carry on our argument in terms of the 
sample, but always with the end in view of establishing estimates for the 
relationships existing in the population corresponding to the sample. 

Let $X_{it}$ $(i = 1, 2, \cdots, p;\ t = 1, 2, \cdots, N)$ be a set of random
variables. The observations in this sample correspond to a normally
distributed multivariate population. We assume that each of the variables
$X_1, \cdots, X_p$ has been observed at $N$ points, $t = 1, 2, \cdots, N$.

Denote by

(1)    $\bar{X}_i = \frac{1}{N} \sum_{t=1}^{N} X_{it}$    $(i = 1, 2, \cdots, p)$

the sample means of the $p$ variables. Then:

(2)    $x_{it} = X_{it} - \bar{X}_i$    $(i = 1, 2, \cdots, p;\ t = 1, 2, \cdots, N)$

are the deviations from the means. The sums of squares and products are:

(3)    $S_{ij} = \sum_{t=1}^{N} x_{it} x_{jt}$    $(i, j = 1, 2, \cdots, p)$

and the sample variances and covariances:

(4)    $a_{ij} = \frac{S_{ij}}{N - 1}$    $(i, j = 1, 2, \cdots, p)$

We divide by $N - 1$ in order to get unbiased estimates of the population


6 H. T. Davis: Analysis of Economic Time Series (Bloomington, Ind., 1941),
pp. 175 ff.





variances and covariances. The sample correlation coefficient between
$X_i$ and $X_j$ is:

(5)    $r_{ij} = \frac{a_{ij}}{\sqrt{a_{ii} a_{jj}}}$    $(i, j = 1, 2, \cdots, p)$

Finally the standardized variables, deviations from the sample means
expressed in terms of their standard deviations $\sqrt{a_{ii}}$, are

(6)    $z_{it} = \frac{x_{it}}{\sqrt{a_{ii}}}$    $(i = 1, 2, \cdots, p;\ t = 1, 2, \cdots, N)$
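
The quantities of this section can be computed together in a few lines. The following Python sketch is an illustration only: the function name `sample_moments`, the layout of the data (one row per variable, one column per observation), and the numbers are assumptions made here.

```python
import numpy as np


def sample_moments(data):
    """Return means, S_ij, a_ij, r_ij and z_it for a (p, N) array of observations."""
    data = np.asarray(data, dtype=float)
    p, n_obs = data.shape
    means = data.mean(axis=1)                 # sample means of the p variables
    dev = data - means[:, None]               # deviations from the means
    s = dev @ dev.T                           # sums of squares and products
    a = s / (n_obs - 1)                       # sample variances and covariances
    sd = np.sqrt(np.diag(a))                  # standard deviations
    r = a / np.outer(sd, sd)                  # correlation coefficients
    z = dev / sd[:, None]                     # standardized variables
    return means, s, a, r, z


# Hypothetical observations: p = 2 variables at N = 4 points.
data = [[1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 9.0]]
means, s, a, r, z = sample_moments(data)
print(means)
print(a)
print(r)
```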


6.2 Discriminant Analysis 

The first method discussed here is the method of discriminant functions 

introduced into statistics by R. A. Fisher. 1 

The problem to be solved is the following: Assume that we have a set 
of measurements of a number of variables which are classified into two 
groups. Which linear combination of the various measurements will in 

a certain sense best discriminate between the two groups?

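Assume that we have $N$ normally distributed observations on $p$ variables
$X_i$, which we denote by $X_{it}$ $(i = 1, 2, \cdots, p;\ t = 1, 2, \cdots, N)$. Classify
these into two groups for $t = 1, 2, \cdots, N_1$ and $t = N_1 + 1, N_1 + 2,
\cdots, N_1 + N_2 = N$. We define the means in each group:

(1)    $\bar{X}_i^{*} = \frac{1}{N_1} \sum_{t=1}^{N_1} X_{it}, \qquad \bar{X}_i^{**} = \frac{1}{N_2} \sum_{t=N_1+1}^{N} X_{it}$    $(i = 1, 2, \cdots, p)$

Let the difference of the means be:

(2)    $d_i = \bar{X}_i^{**} - \bar{X}_i^{*}$    $(i = 1, 2, \cdots, p)$

We want to find the linear function of the differences of the means:

(3)    $Z = k_1 d_1 + k_2 d_2 + \cdots + k_p d_p$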


1 R. A. Fisher: “The use of multiple measurements in taxonomic problems,”
Annals of Eugenics, vol. 7 (1936), pp. 179 ff. See also “The statistical utilization
of multiple measurements,” ibid., vol. 8 (1938), pp. 376 ff.; Contributions to
Mathematical Statistics (New York, 1950), Papers 32 and 33; Statistical
Methods for Research Workers (8th ed., London, 1941), section 49.2, pp. 279 ff.
G. W. Brown: “Discriminant functions,” Annals of Mathematical Statistics,
vol. 18 (1947), pp. 514 ff. P. G. Hoel: Introduction to Mathematical Statistics
(New York, 1947), pp. 121 ff. C. R. Rao: “On some problems arising out of
discrimination with multiple characters,” Sankhya, vol. 9 (1944), pp. 3 .
“The utilization of multiple measurements in problems of biological
classification,” Journal of the Royal Statistical Society, series B, vol. 10 (1948),
pp. 159 ff.





which discriminates most successfully in a certain sense between the two
sets of variables. Its square should be a maximum relative to its variance.
The variance is proportional to:

(4)    $Q = \sum_{i=1}^{p} \sum_{j=1}^{p} k_i k_j S_{ij}$

In order to maximize $Z^2$ (3) under the condition that the variance (4)
is a constant, we form the function:

(5)    $F = Z^2 - \lambda Q = \sum_{i=1}^{p} \sum_{j=1}^{p} k_i k_j d_i d_j - \lambda \sum_{i=1}^{p} \sum_{j=1}^{p} k_i k_j S_{ij}$

where $\lambda$ is a Lagrange multiplier.

We differentiate (5) partially with respect to $k_j$ $(j = 1, 2, \cdots, p)$ and
obtain after some simplifications the set of equations:

(6)    $d_j \sum_{i=1}^{p} k_i d_i = \lambda \sum_{i=1}^{p} k_i S_{ij}$    $(j = 1, 2, \cdots, p)$

To simplify computations we put $\lambda = \sum_{i=1}^{p} k_i d_i$ and we obtain the system
of equations:

(7)
$S_{11} k_1 + S_{12} k_2 + \cdots + S_{1p} k_p = d_1$
$S_{12} k_1 + S_{22} k_2 + \cdots + S_{2p} k_p = d_2$
$\cdots$
$S_{1p} k_1 + S_{2p} k_2 + \cdots + S_{pp} k_p = d_p$

The solutions $k_i$ are proportional to the estimates of the coefficients of
the linear function which in the population corresponding to the sample 
discriminates best between the two groups in the sense defined above. 
The similarity of the system of equations (7) and the normal equations in 
multiple regression analysis [formula (5) of section 5.1] should be noted. 

A test of significance has been indicated by R. A. Fisher which makes 
use of Hotelling’s generalized Student distribution. 2 This distribution 
was derived in 1931. Define a quantity analogous to the multiple 
correlation coefficient: 



(8)    $R^2 = \frac{N_1 N_2 (k_1 d_1 + k_2 d_2 + \cdots + k_p d_p)}{N}$

Then the variance ratio

(9)    $F = \frac{(N - p - 1) R^2}{p (1 - R^2)}$


2 H. Hotelling: “The generalization of Student's ratio,” Annals of
Mathematical Statistics, vol. 2 (1931), pp. 360 ff.






has Snedecor’s $F$-distribution for $n_1 = p$ and $n_2 = N - p - 1$ degrees of
freedom. In this fashion we can test the hypothesis that the empirical 
discriminant function may have arisen out of pure chance, if in reality 
there is no difference at all between the variates in the two groups in the 
population. This test is also related to some work of the Indian school 
of statistics. 3 

Example 1. The method has been applied in a most interesting way 
by David Durand to a set of financial data. He utilized it to discriminate 
between good and bad loans . 4 

Let $X_1$ be the down payment, $X_2$ the price, $X_3$ the monthly income (all
in dollars), and $X_4$ the length of the contract in months. Then a linear
function has been determined which, in a sample of 484 good and 485
bad loans, discriminates best between good and bad loans:

(10)    $Z = X_1 - 0.174 X_2 + 0.124 X_3 - 6.45 X_4$

This method may also be applied to the classification of various econ¬ 
omic phenomena. For instance, a group of prices called sensitive prices 5 
was frequently used in an attempt to anticipate more general price move¬ 
ments. The question whether a given price should be included in this 
group could be decided by finding a set of relevant measurements for 
each of a number of sensitive and non-sensitive prices and then computing 
the linear combination of the measurements which discriminates most 
successfully between sensitive and non-sensitive prices. This discriminant 
function could then be used in order to classify a given price into one or 
the other of the two groups. 

A similar problem is the classification of prices into prices of consumers’ 
goods and producers' goods in relation to their behavior in the cycle 
which we propose to illustrate by an example. 


3 See, e.g., P. C. Mahalanobis: “On generalized distance in statistics,”
Proceedings of the National Institute of Science of India, vol. 12 (1936), pp. 49 ff.
R. C. Bose and S. N. Roy: “The distribution of the Studentized $D^2$-statistic,”
Sankhya, vol. 4 (1938), pp. 19 ff. P. C. Mahalanobis, R. C. Bose, and P. N.
Roy: “Normalisation of statistical variates and the use of rectangular
co-ordinates in the theory of sampling distributions,” ibid., vol. 3 (1937), pp. 1 ff.
S. N. Roy: “Analysis of variance for multivariate normal populations,” ibid.,
vol. 6 (1940), pp. 35 ff.

4 D. Durand: “Risk elements in consumer installment financing,” Financial
Research Program, Studies in Consumer Installment Financing 8, National
Bureau of Economic Research (New York, 1941), pp. 125 ff.

5 E. Wageman: Konjunkturlehre (Berlin, 1928), pp. 128 ff., 137 ff. 





Example 2. We have tried to apply the methods of discriminant 
analysis to the following problem: 6 Is it possible to distinguish between 
the prices of producers' goods and the prices of consumers' goods on the 
basis of certain measurements connected with their behavior during the 
business cycle? We are going to use some data collected and analyzed 
in a previous book of the author. 7 We will use monthly English wholesale 
prices, taken from the period 1860-1913. The seasonal variation and 
the trend were eliminated from these series by a system of moving averages. 
This method is described in section 8.2. 

We denote by $X_1$ the median length of the cycle in months. This is
the median of all cycles in the period, measured from minimum to 
minimum. 

X 2 is the median percentage of the duration of cyclically rising prices 
relative to the total duration of the cycle. 

X 3 is the median cyclical amplitude expressed as percentage of the 
trend. 

X 4 is the mean monthly rate of change in the cycle (percentage of trend 
per month). 

We will try to construct a kind of “index" which will best discriminate 
between consumers' goods and producers' goods on the basis of the 
measures of cyclical behavior indicated above. If we can do this, we 
will have a method which in a sense would measure most efficiently the 
“cyclical distance" between prices of various commodities. 8 

The linear discriminant function (3) will be in our case: 

(11)    $Z = k_1 X_1 + k_2 X_2 + k_3 X_3 + k_4 X_4$

We have chosen 19 prices. The various measurements for the data
are indicated in Table 1. These prices are classified into two groups:
consumers' goods (9 prices) and producers’ goods (10 prices). This is
only a small sample, but we may be able to draw some tentative conclusions
from it.

$X_i$ are the cyclical measurements, $\bar{X}_i^{*}$ the averages for consumers’ goods,
$\bar{X}_i^{**}$ the averages for producers’ goods, $\bar{X}_i$ the general averages, $d_i$ the
differences between the averages of the two groups.


6 G. Tintner: “Some applications of multivariate analysis to economic data,”
Journal of the American Statistical Association, vol. 41 (1946), pp. 476 ff.

7 G. Tintner: Prices in the Trade Cycle (Vienna, 1935), Table 7, pp. 142 ff.

8 H. Hotelling: “Spaces of statistics and their metrization," Science , vol. 47 





TABLE 1

Cyclical Measurements

Price                      X_1       X_2       X_3       X_4          Z

Consumers' Goods
Rice                      72        50         8         0.5      0.186
Tea                       66.5      48        15         1.0      0.224
Sugar                     54        57        14         1.0      0.200
Flour                     67        60        15         0.9      0.228
Coffee                    44        57        14         0.3      0.183
Potatoes                  41        52        18         1.9      0.207
Butter                    34.5      50         4         0.5      0.098
Cheese                    34.5      46         8.5       1.0      0.128
Beef                      24        54         3         1.2      0.076
Average, X_i*             48.611    52.667    11.056     0.922    0.170017

Producers' Goods
Gasoline                  57        57        12.5       0.9      0.194
Lead                     100        54        17         0.5      0.293
Pig iron                 100        32        16.5       0.7      0.283
Copper                    96.5      65        20.5       0.9      0.315
Zinc                      79        51        18         0.9      0.266
Tin                       78.5      53        18         1.2      0.266
Rubber                    48        50        21         1.6      0.238
Quicksilver              155        44        20.5       1.4      0.404
Copper sheets             84        64        13         0.8      0.243
Iron bars                105        35        17         1.8      0.298
Average, X_i**            90.30     50.500    17.400     1.070    0.279983

General average, X_i      70.553    51.526    14.395     1.000    0.227871
Difference, d_i           41.689    -2.167     6.344     0.148    0.109921

Coal                      77        48        12         0.6      0.220


The matrix of the sums of squares and products computed from the 
data given is presented in Table 2. Only the elements above the diagonal 
are given since the matrix is symmetrical. 


TABLE 2

Sums of Squares and Products

              X_1           X_2           X_3          X_4

X_1      18,382.456    -1,350.966     1,833.410       21.393
X_2                     1,275.349       -45.623      -18.794
X_3                                     495.146       16.345
X_4                                                    3.460





The system of equations (7) is used to determine our estimates $k_i$.
The necessary data are taken from Table 2, and $d_i$ from Table 1:

(12)
$18,382.456 k_1 - 1,350.966 k_2 + 1,833.410 k_3 + 21.393 k_4 = 41.689$
$-1,350.966 k_1 + 1,275.349 k_2 - 45.623 k_3 - 18.794 k_4 = -2.167$
$1,833.410 k_1 - 45.623 k_2 + 495.146 k_3 + 16.345 k_4 = 6.344$
$21.393 k_1 - 18.794 k_2 + 16.345 k_3 + 3.460 k_4 = 0.148$

The solutions are indicated in the following linear discriminant function:

(13)    $Z = 0.001605 X_1 + 0.000277 X_2 + 0.006825 X_3 + 0.002115 X_4$

The meaning of function (13) is as follows: If $Z$ is larger than at the
general mean (0.227871), a commodity should be classified as a producers’
good; in the opposite case, as a consumers’ good. The average $Z$ for
producers' goods is 0.279983 and for consumers’ goods 0.170017. Only
one consumers’ good (flour) and one producers' good (gasoline) are
misclassified. The values of $Z$ for various commodities and also for the
averages are indicated in the last column of Table 1.


It is interesting to note that in this function $Z$ the largest weight has
been given to $X_3$ (amplitude). This seems to indicate that the cyclical
amplitude is possibly more important than other characteristics in
distinguishing consumers’ and producers’ goods.

Our results may be used for the classification of dubious commodities.
Such a commodity is, for instance, coal, which during the period serves
both as a consumers’ good (for heating of houses, cooking, etc.) and as
a producers’ good (e.g., for making steel). The value of the discriminant
function $Z$ is, for coal, 0.220, as indicated in the last line of our Table 1.
This is just below the average $Z$ (0.227871). Hence we must classify coal
as a consumers’ good rather than a producers’ good. Since the value
of $Z$ is very near to the average we may, however, doubt whether this
classification is really valid.
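
The numerical work of this example can be retraced with a short Python sketch. It solves the system of equations formed from the sums of squares and products of Table 2 and the differences $d_i$ of Table 1, and then evaluates $Z$ for the general averages and for coal. The variable names are assumptions made for illustration; since the data are rounded, the coefficients agree with those printed only to within rounding.

```python
import numpy as np

# Sums of squares and products (Table 2), written out as a full symmetric matrix.
S = np.array([
    [18382.456, -1350.966, 1833.410, 21.393],
    [-1350.966, 1275.349, -45.623, -18.794],
    [1833.410, -45.623, 495.146, 16.345],
    [21.393, -18.794, 16.345, 3.460],
])
d = np.array([41.689, -2.167, 6.344, 0.148])         # differences of group means

k = np.linalg.solve(S, d)                            # discriminant coefficients

general_average = np.array([70.553, 51.526, 14.395, 1.000])
coal = np.array([77.0, 48.0, 12.0, 0.6])

z_general = k @ general_average                      # dividing value of Z
z_coal = k @ coal
label = "producers' good" if z_coal > z_general else "consumers' good"
print(k)
print(z_general, z_coal, label)
```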

In order to test the significance of our discriminant function we proceed
as follows: We compute $R^2$ from formula (8) as $R^2 = (90)(0.109921)/19
= 0.520678$. The variance ratio from formula (9) is $F = 3.802$.
It is clearly significant at the 5 per cent level: for 4 and 14 degrees of
freedom we require at the 5 per cent level of significance an $F$ of only 3.11;
at the 1 per cent level an $F$ of 5.03 is required. The null hypothesis that our
discriminant function could have arisen by pure chance is refuted at the
5 per cent level. (This test could also have been made by Hotelling’s methods
without first computing the discriminant function.)
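
The two figures of the test can be checked as follows. The sketch assumes, as the arithmetic of the example indicates, that $R^2$ is obtained as $N_1 N_2 \sum k_i d_i / N$ and $F$ as $(N - p - 1) R^2 / [p (1 - R^2)]$; the variable names are hypothetical.

```python
n1, n2, p = 9, 10, 4          # consumers' goods, producers' goods, measurements
n = n1 + n2
z_d = 0.109921                # value of k_1 d_1 + ... + k_p d_p from Table 1

r_squared = n1 * n2 * z_d / n
f_value = (n - p - 1) * r_squared / (p * (1.0 - r_squared))
df1, df2 = p, n - p - 1

# Tabulated values quoted in the text for 4 and 14 degrees of freedom.
significant_5_per_cent = f_value > 3.11
significant_1_per_cent = f_value > 5.03
print(r_squared, f_value, df1, df2)
print(significant_5_per_cent, significant_1_per_cent)
```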

Hence it is likely that in the population some difference exists between





the two groups. We would conclude that there probably is an effective 
linear combination of the cyclical measures indicated above which dis¬ 
tinguishes successfully between consumers’ and producers’ goods on the 
basis of the data used. It is interesting to note that if another consumers' 
good—namely, pepper—is included in the analysis we do not achieve 
significant results. But this commodity shows a very peculiar cyclical 
behavior probably due to specific fluctuations in production. Hence it 
may be better to neglect it. 

Our result is possibly of some economic importance. It should be 
interpreted in the light of the obvious limitations of our methods in 
dealing with this problem. The characteristics indicated in Table 1 are
probably not really normally distributed in spite of the fact that the 
median in large samples tends under certain conditions to be normally 
distributed. It is also possible that a non-linear combination of charac¬ 
teristics would be more adequate in our case. 

If our results were more trustworthy and also were based upon a larger 
sample of commodities covering a longer period in several countries, we 
could draw more significant conclusions. We may still tentatively say 
that our analysis seems to support to a certain degree the contentions of 
the majority of business cycle theorists. 9 It provides some evidence for 
the assertion that there is a difference in the cyclical behavior of
producers' goods and consumers' goods as far as their prices are concerned.

6.3 Principal Components 

The method presented here was first devised by Hotelling 1 to deal with 
the problem of factor analysis in psychology: 2 How can we analyze a 
group of variables into a more fundamental set of independent, i.e., 
orthogonal, components called “factors”? 


9 G. von Haberler: Prosperity and Depression (Geneva, 1939), pp. 279 ff. 
J. Tinbergen: The Dynamics of Business Cycles (Chicago, 1950), pp. 171 ff. 

1 H. Hotelling: “Analysis of a complex of statistical variables into principal 
components,” Journal of Educational Psychology , vol. 24 (1933), pp. 417 ff., 
498 ff. See also S. S. Wilks: Mathematical Statistics (Princeton, 1943), pp. 
252 ff. 

2 K. J. Holzinger and H. H. Harman: Factor Analysis (Chicago, 1941). 
M. G. Kendall and B. Babington Smith: “Factor analysis,” Journal of the 
Royal Statistical Society, series B, vol. 12 (1950), pp. 60 ff. G. H. Thomson:
The Factorial Analysis of Human Ability (2nd ed., New York, 1942). D. N.
Lawley: “The estimation of factor loadings by the method of maximum 
likelihood,” Proceedings of the Royal Society of Edinburgh, vol. 60 (1940), pp. 
64 ff. 





Girshick 3 has shown in an important article that the same method can 

also be applied to the solution of other problems: We have a set of 

variates, each of which consists of the sum of a systematic component 

and an error (section 6.5). How can we find a linear function of the

variates which is least subject to the “errors”? Another problem which 

leads to the same method is the following: Can we find a linear function 

of the given variates which is such that the sum of the squares of the 

correlation coefficients of each variate with the requested linear function 

is a maximum? Girshick also showed that the principal components 

method leads to maximum likelihood estimates if the variates are normally 
distributed. 

Assume that we want to replace a set of standardized variables
$z_1, \dots, z_p$ by a more fundamental set of variables $u_1, \dots, u_p$.
Let us define:

$$
\begin{aligned}
z_{1t} &= k_{11}u_{1t} + k_{12}u_{2t} + \cdots + k_{1p}u_{pt} \\
&\;\;\vdots \\
z_{pt} &= k_{p1}u_{1t} + k_{p2}u_{2t} + \cdots + k_{pp}u_{pt}
\end{aligned}
\tag{1}
$$

The variables $u_1, \dots, u_p$ are the principal components. They are
orthogonal, i.e., $\sum_t u_{it}u_{jt} = 0$ $(i \neq j)$. The $k_{ij}$ are
constants. We want the $u_i$ to reproduce the original correlations between
the variables $z_i$:

$$
r_{ij} = k_{i1}k_{j1} + k_{i2}k_{j2} + \cdots + k_{ip}k_{jp}
\qquad (i, j = 1, 2, \dots, p)
\tag{2}
$$

The expression:

$$
k_{1j}^2 + k_{2j}^2 + \cdots + k_{pj}^2 \qquad (j = 1, 2, \dots, p)
\tag{3}
$$

is the contribution of a principal component $u_j$ to the
variances of all standardized variables $z_i$.

We want to maximize the contribution of the first principal component:

$$
S_1 = \sum_{i=1}^{p} k_{i1}^2
\tag{4}
$$

under the conditions (2). We form the function:

$$
F = \sum_{i=1}^{p} k_{i1}^2
  - \sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{s=1}^{p} \mu_{ij}\,k_{is}k_{js}
\tag{5}
$$

where the $\mu_{ij}$ are Lagrange multipliers. Let $k_{11}, k_{21}, \dots, k_{p1}$
be the

3 M. A. Girshick: “Principal components,” Journal of the American Statistical
Association, vol. 31 (1936), pp. 519 ff.






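coefficients of the first principal component. We differentiate partially
with respect to $k_{is}$ and obtain after a simplification the system:

$$
\begin{aligned}
k_{i1} - \sum_{j=1}^{p} \mu_{ij}k_{j1} &= 0 \\
-\sum_{j=1}^{p} \mu_{ij}k_{js} &= 0 \qquad (s = 2, 3, \dots, p)
\end{aligned}
\tag{6}
$$

We multiply each equation in this system by $k_{i1}$ and sum with respect
to $i$. This gives:

$$
\begin{aligned}
\sum_{i=1}^{p} k_{i1}^2 - \sum_{i=1}^{p}\sum_{j=1}^{p} \mu_{ij}k_{i1}k_{j1} &= 0 \\
-\sum_{i=1}^{p}\sum_{j=1}^{p} \mu_{ij}k_{i1}k_{js} &= 0 \qquad (s = 2, 3, \dots, p)
\end{aligned}
\tag{7}
$$

But from the first equation in (6) we have
$\sum_{i=1}^{p} \mu_{ij}k_{i1} = k_{j1}$. We put also
$\sum_{i=1}^{p} k_{i1}^2 = \lambda_1$, a constant, and obtain:

$$
\begin{aligned}
\lambda_1 - \sum_{j=1}^{p} k_{j1}^2 &= 0 \\
-\sum_{j=1}^{p} k_{j1}k_{js} &= 0 \qquad (s = 2, 3, \dots, p)
\end{aligned}
\tag{8}
$$

Now we multiply each equation in this system by $k_{is}$ and sum with
respect to $s$:

$$
\lambda_1 k_{i1} - \sum_{j=1}^{p}\sum_{s=1}^{p} k_{j1}k_{js}k_{is} = 0
\qquad (i = 1, 2, \dots, p)
\tag{9}
$$

But by using (2) we have the system:

$$
\begin{aligned}
k_{11} + r_{12}k_{21} + \cdots + r_{1p}k_{p1} &= \lambda_1 k_{11} \\
&\;\;\vdots \\
r_{1p}k_{11} + r_{2p}k_{21} + \cdots + k_{p1} &= \lambda_1 k_{p1}
\end{aligned}
\tag{10}
$$

This system of linear homogeneous equations can have a non-trivial
solution only if its determinant is equal to zero (Appendix A.1.2):

$$
\begin{vmatrix}
(1 - \lambda_1) & r_{12} & \cdots & r_{1p} \\
\vdots & \vdots & & \vdots \\
r_{1p} & r_{2p} & \cdots & (1 - \lambda_1)
\end{vmatrix}
= 0
\tag{11}
$$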




It can be shown that the largest root of (11) is associated with the first
principal component, which accounts for most of the variance of the
standardized variables. This follows from the first equation in (8). The
coefficients of the first principal component can be computed from the
homogeneous system of linear equations (10).

Next we compute the second largest root of the determinantal equation
(11). Inserting this root into (10) instead of $\lambda_1$, we compute the
second principal component, which contributes most to the variance of the
standardized variables after the first principal component has been
removed, etc.
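
The computation of the largest root and of the corresponding coefficients can be illustrated with a small numerical sketch. The code below is only an illustration under stated assumptions: it uses a simple power iteration in plain Python, the function name `largest_root` is hypothetical, and the 2 x 2 correlation matrix (with correlation 0.6) is invented for the purpose. It is not the computational scheme of Appendix A.2. The coefficients are scaled so that the sum of their squares equals the root, as required by the first equation in (8).

```python
import math

def largest_root(r, iterations=200):
    """Power iteration: largest root of (11) and loadings scaled as in (8)."""
    p = len(r)
    v = [1.0] * p                      # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(p)) for i in range(p)]  # R v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]      # keep the vector at unit length
    # Rayleigh quotient v'Rv gives the root belonging to the unit-length v
    rv = [sum(r[i][j] * v[j] for j in range(p)) for i in range(p)]
    lam = sum(v[i] * rv[i] for i in range(p))
    # rescale so that the sum of squared loadings equals the root
    k = [math.sqrt(lam) * x for x in v]
    return lam, k

# Hypothetical correlation matrix of two standardized variables
R = [[1.0, 0.6],
     [0.6, 1.0]]
lam1, k1 = largest_root(R)
print(round(lam1, 6), [round(x, 6) for x in k1])
```

For two variables with correlation $r$ the roots of (11) are $1 + r$ and $1 - r$, so the largest root here is 1.6 and each squared coefficient is 0.8; the first component then accounts for 80 per cent of the variance of each variable.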

Assume now that we have a set of variables $z_i$ consisting of two parts:
$z_i = m_i + y_i$, where $m_i$ is the “true” value or mathematical expectation
and $y_i$ is a “random error.” This idea is closely related to the methods
of weighted regression discussed in section 6.5. The random errors have
the same variance $\sigma^2$ and are independent of each other. We want to
find a linear function: $u = k_1 z_1 + k_2 z_2 + \cdots + k_p z_p$. This ought to
be chosen in such a fashion that the variance of the errors, which is
proportional to $k_1^2 + k_2^2 + \cdots + k_p^2$, is a minimum and that the variance
of $u$ is 1. Girshick 4 has shown that this leads to the previous method.
We must choose as $\lambda$ the largest root of the determinantal equation (11).
If we drop the second subscript of the $k_{i1}$ we get exactly the same solutions
as before. This second interpretation may be more useful than the
original one in econometric research.

We have to minimize:

$$
\sigma^2 \sum_{i=1}^{p} k_i^2
\tag{12}
$$

where $\sigma^2$ is the variance of the (independent) random errors. The
condition that the variance of $u$ should be 1 is:

$$
\sum_{i=1}^{p}\sum_{j=1}^{p} k_i k_j r_{ij} = 1
\tag{13}
$$

We introduce a Lagrange multiplier $l$, and form the new function:

$$
F = \sigma^2 \sum_{i=1}^{p} k_i^2 - l \sum_{i=1}^{p}\sum_{j=1}^{p} k_i k_j r_{ij}
\tag{14}
$$

If we differentiate partially with respect to the variables $k_i$ and put
[...]




fulfilled. We choose again the largest root of the determinantal equation 
(11). 
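
The step from the constrained minimum back to system (10) can be sketched in matrix notation, which the text itself does not use. Here $R$ stands for the correlation matrix $[r_{ij}]$, $k$ for the column of coefficients $k_i$, and $l$ for the multiplier; the way the multiplier enters the function is an assumption made for this illustration.

```latex
\begin{aligned}
F &= \sigma^2\, k'k - l\,(k'Rk - 1), \\
\frac{\partial F}{\partial k} &= 2\sigma^2 k - 2 l\, R k = 0
  \quad\Longrightarrow\quad R k = \lambda k, \qquad \lambda = \sigma^2 / l, \\
k'Rk &= 1 \quad\Longrightarrow\quad \lambda\, k'k = 1
  \quad\Longrightarrow\quad \sigma^2\, k'k = \sigma^2 / \lambda .
\end{aligned}
```

Under these assumptions the error variance equals $\sigma^2/\lambda$, so it is smallest when $\lambda$ is the largest root of (11).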

Finally, define a linear function $u = k_1 z_1 + \cdots + k_p z_p$. We want
to determine the coefficients $k_i$ in such a fashion that the sum of all the
squares of the correlation coefficients of $u$ with the original standardized
variables $z_1, z_2, \dots, z_p$ becomes a maximum. We use again the condition
that the variance of $u$ should be 1. It can be shown 5 that this leads
again to the former method if we drop the second subscript in (10) and
take the largest root of (11).

We want now to maximize:

$$
T = \sum_{i=1}^{p}\Bigl(\sum_{j=1}^{p} k_j r_{ij}\Bigr)^2
\tag{15}
$$

which is the sum of the squares of the correlation coefficients of the index
$u = k_1 z_1 + k_2 z_2 + \cdots + k_p z_p$ with all the standardized variables
$z_1, z_2, \dots, z_p$. The condition that the variance of this index should
be 1 is again (13).

We form the new function:

$$
Q = \sum_{i=1}^{p}\Bigl(\sum_{j=1}^{p} k_j r_{ij}\Bigr)^2
  - \lambda \sum_{i=1}^{p}\sum_{j=1}^{p} k_i k_j r_{ij}
\tag{16}
$$

where $\lambda$ is a Lagrange multiplier.

Differentiating (16) with respect to $k_s$ and setting the result equal to
zero, we obtain:

$$
\sum_{i=1}^{p}\sum_{j=1}^{p} k_j r_{ij} r_{is} - \lambda \sum_{j=1}^{p} k_j r_{js} = 0
\qquad (s = 1, 2, \dots, p)
\tag{17}
$$

Now let $[r^{ij}]$ be the matrix which is inverse to the correlation matrix
(Appendix A.1.2). Multiplying by the elements of this matrix and
adding, we obtain a system of equations similar to (10).
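
The last step may be easier to follow in matrix form. The lines below are a sketch in a notation not used in the text: $R$ denotes the correlation matrix, $k$ the column of coefficients, and the constraint (13) is written $k'Rk = 1$.

```latex
\begin{aligned}
T &= k'R\,R\,k, \qquad k'Rk = 1, \\
R\,R\,k - \lambda\, R\,k &= 0
  \quad\Longrightarrow\quad R\,k = \lambda\, k
  \quad (\text{premultiplying by } R^{-1}), \\
T &= k'R\,(\lambda k) = \lambda\, k'Rk = \lambda .
\end{aligned}
```

On this reading the maximum of the sum of squared correlations is itself equal to the root, which is why the largest root of (11) is taken.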

Girshick 6 has also shown that the method of principal components 
results from the maximum likelihood approach if the original variates 
$X_{it}$ follow a normal multivariate distribution. Hence it follows that
it provides estimates of the principal components in the population 
corresponding to the sample which have certain optimum properties 
associated with maximum likelihood solutions (see section 5.2). The 
results of this method need not, however, be meaningful in economic 
terms. The problem of identification will be discussed in section 6.5 
and Chapter 7. 


5 Ibid., pp. 525 ff.

6 Ibid., pp. 527 ff. 





The distribution of the latent roots of the determinantal equation (11)
has been established by various authors. 7

Use has been made of the method by M. J. Hagood and E. H. Bernert 8
in the field of sampling of economic data.

A possible use of factor analysis in economic statistics is the following:
We have frequently series of data which are very short, e.g., about 20
yearly observations. By replacing several variables with a few principal
components we may be able to save a considerable number of degrees
of freedom which would otherwise not be available. This may conceivably
be of some importance in practical applications of correlation methods
to economic data. There is evidently a relation with the aggregation
problem. This will be discussed below.

The most important class of problems to which the method of principal
components could be applied are perhaps those connected with statistical
questions arising from the transition from microeconomic to macroeconomic
analysis. 9 These questions have been discussed from the point
of view of economic theory, 10 but never verified statistically by the use
of valid methods.

In an article, R. Stone 11 has applied the method of principal components
to blocks of transactions in the American economy. The data are taken
from the years 1922-38. The proportion of the variance of each variable
accounted for by each of the three largest factors and the residual variance
are shown in Table 1.


7 R. A. Fisher: “The sampling distribution of some statistics obtained from
non-linear equations,” Annals of Eugenics, vol. 9 (1939), pp. 238 ff. P. L. Hsu:
“On the distribution of roots of certain determinantal equations,” ibid.,
pp. 250 ff. S. S. Wilks: op. cit., pp. [...] (unpublished results by A. M. Mood,
[...]).

8 M. J. Hagood and E. H. Bernert: “Component indexes as a basis for
stratification in sampling,” Journal of the American Statistical Association,
vol. 40 (1945), pp. 330 ff.

9 R. Frisch: “Propagation problems and impulse problems in dynamic
economics,” Economic Essays in Honor of Gustav Cassel (London, 1933), pp.
171 ff. M. Kalecki: “A macrodynamic theory of business cycles,” Econometrica,
vol. 3 (1935), pp. [...]. L. R. Klein: The Keynesian Revolution
(New York, 1947).

10 [...] (Bloomington, [...]).

11 R. Stone: [...]





TABLE 1

                                                        Proportion of Variance Accounted for by
                                                        First     Second    Third
Variable                                                Factor    Factor    Factor    Residual

Employees' compensation                                 0.9666    0.0229    0.0089    0.0016
Consumers' perishable and producers' durable goods      0.9068    0.0009    0.0851    0.0072
Net savings of enterprise plus capital revaluation      0.5018    0.4918    0.0006    0.0058
Consumers' semidurable plus durable goods               0.9462    0.0087    0.0296    0.0155
Consumers' services                                     0.8267    0.1215    0.0431    0.0087
Construction                                            0.7311    0.0051    0.2581    0.0057
Net public outlay                                       0.4992    0.0019    0.3367    0.1622
Net increase in inventories                             0.4230    0.1010    0.1093    0.3667
Inventory revaluation adjustment                        0.0889    0.7209    0.0252    0.1650
Net rent received by individuals                        0.5467    0.0332    0.3961    0.0240
Entrepreneurial withdrawals                             0.8430    0.0558    0.0849    0.0163
Dividends                                               0.7184    0.1382    0.0590    0.0834
Adjustment for depreciation                             0.8270    0.1333    0.0066    0.0331
Interest                                                0.0000    0.7486    0.0001    0.2513
Dividends, etc., from abroad less direct taxes of
  individuals plus veterans' bonus plus social
  security benefits less employers' contribution
  to social security                                    0.2108    0.0001    0.0537    0.7354
Adjustment for depreciation and depletion               0.7078    0.0885    0.0074    0.1963
Foreign balance, including foreign tourist expenditure  0.0447    0.0051    0.3883    0.5619

Total                                                   0.8076    0.1059    0.0609    0.0256


The three largest factors can tentatively be identified with income (or
output), rate of change of income, and a time trend. Correlation
coefficients of the three factors with these variables are shown in Table 2.


TABLE 2

                              First     Second    Third
                              Factor    Factor    Factor

Income                         0.995    -0.041     0.057
Rate of change of income      -0.056     0.948    -0.124
Time trend                    -0.369    -0.282    -0.836





It is apparent that we can represent the seventeen variables listed in
Table 1 reasonably well by not more than three factors. Together they
account for more than 97 per cent of the total variance of all variables.
Hence in a statistical analysis of economic data the use of the three
principal components instead of the seventeen original variables will save
14 degrees of freedom. This is conceivably an important gain because
of the short length of many economic series available for statistical
analysis.
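
The two figures just quoted follow directly from the “Total” row of Table 1. The short sketch below only repeats that arithmetic; the variable names are chosen for illustration.

```python
# Shares of total variance from the "Total" row of Table 1
first, second, third, residual = 0.8076, 0.1059, 0.0609, 0.0256

explained = first + second + third          # share of the three factors
n_variables = 17                            # variables listed in Table 1
n_factors = 3                               # principal components retained
saved = n_variables - n_factors             # degrees of freedom saved

print(f"explained share: {explained:.4f}")
print(f"degrees of freedom saved: {saved}")
```

The three factors together account for a share of 0.9744, i.e., more than 97 per cent, and replacing seventeen variables by three components saves 14 degrees of freedom.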

The ideas contained in Stone's work are extremely interesting and very
suggestive for econometric work. It may be asked, however, whether 
the elaborate principal components analysis was necessary if there are 
three economic variables (income, rate of change of income, and time) 
which can be more or less identified with the three factors which account 
for most of the variance of the seventeen blocks of transactions. It 
might have been simpler to compute the multiple regression equations 
of all the variables separately upon these three explanatory variables. 

Another use of principal components is in the field of general index 
numbers. The practical importance of a solution of this problem lies in 
the following: 12 Many questions of economic policy require a knowledge 
of the broad economic relationships which are discussed in economic 
theory under the name of general equilibrium. This is true, for instance, 
of problems of full employment, taxation, subsidies, etc., which ought to 
be discussed in the most general terms possible. It is obviously
impossible to verify statistically a complete general equilibrium system
because of the great number of variables involved. It would be necessary 


12 G. von Haberler: Der Sinn der Indexzahlen (Tuebingen, 1927). R. Frisch:
“Annual survey of economic theory: the problem of index numbers,”
Econometrica, vol. 4 (1936), pp. 1 ff. See also L. R. Klein: “Macro-economics
and the theory of rational behaviour,” ibid., vol. 14 (1946), pp. 93 ff.
K. May: “The aggregation problem for a one-industry model,” ibid., vol. 14
(1946), pp. 285 ff. S. S. Pou: “A note on macro-economics,” ibid., vol. 14
(1946), pp. 299 ff. L. R. Klein: “Remarks on the theory of aggregation,”
ibid., vol. 14 (1946), pp. 303 ff. K. May: “Technological change and
aggregation,” ibid., vol. 15 (1947), pp. 51 ff. W. W. Leontief: “Introduction
to the theory of the internal structure of functional relationships,” ibid.,
vol. 15 (1947), pp. 361 ff. A. Nataf: “Sur la possibilité de la construction
de certains macromodèles,” ibid., vol. 16 (1948), pp. 232 ff. A. Nataf and
R. Roy: “Remarques et suggestions relatives aux nombres indices,” ibid.,
vol. 16 (1948), pp. 330 ff. A. L. Bowley: Elements of Statistics ([...] ed.,
London, 1948), pp. 196 ff. W. Winkler: Grundriss der Statistik (Vienna,
1948), pp. 206 ff. I. Fisher: The Making of Index Numbers (New York, 1922).
G. U. Yule and M. G. Kendall: Introduction to the Theory of Statistics
(New York, 1950), pp. 590 ff.





to include all prices of all commodities, all quantities of all commodities
produced and consumed, all interest rates, etc. It is obvious that such
a procedure would involve literally thousands and possibly millions of
variables. 13

Hence it seems to be necessary to substitute certain “indices” for groups
of these variables. This procedure has been exemplified above in sections 
3.7 and 3.8. We may want, for instance, to represent all wholesale prices 
by an index of wholesale prices, all quantities produced by an index of 
production, etc. Which particular indices will be chosen depends of 
course upon the nature of the economic problem considered. 

But it is also of interest to establish the statistical validity of these 
indices in the following sense: How perfect is the representation of all 
the various prices, for instance, by some general price index? Which 
percentage of the variance of various quantities produced in the economy 
is accounted for by a certain production index? We believe that questions 
of this nature can be answered tentatively by the method of principal 
components. 

These problems are evidently connected with the general problem of 
index numbers, i.e., the aggregation problem. They have very far- 
reaching significance for the choice between several possible macro- 
economic models 14 and their empirical validity, if we strive for econometric 
applications of these models. 

The very efficient computational methods developed by Hotelling 15 
have been utilized in the following examples; they are discussed in 
Appendix A.2. 

Example 2. This example deals with an attempt to determine the 


13 F. A. von Hayek: Collectivist Economic Planning (London, 1947), pp.
208 ff. G. B. Dantzig: “Maximization of a linear function of variables subject
to linear inequalities,” T. C. Koopmans, ed.: Activity Analysis of Production
and Allocation (New York, 1951), pp. 339 ff. G. W. Brown and T. C.
Koopmans: “Computational suggestions for maximizing a linear function subject
to linear inequalities,” ibid., pp. 377 ff. G. W. Brown and J. von Neumann:
“Solutions of games by differential equations,” in H. W. Kuhn and A. W.
Tucker, eds.: Contributions to the Theory of Games (Princeton, 1951), pp. 73 ff.

14 G. Tintner: “Foundations of probability and statistical inference,” Journal
of the Royal Statistical Society, vol. 112 (1949), pp. 252 ff.; “Static
macro-economic models and their econometric verification,” Metroeconomica, vol. 1
(1949), pp. 48 ff.

15 H. Hotelling: “Simplified calculation of principal components,”
Psychometrika, vol. 1 (1936), pp. 27 ff. P. S. Dwyer: Linear Computations (New York,
1951), pp. 219 ff., 318 ff.





principal components of a set of production indices. 16 This example is
somewhat related to an earlier essay of E. C. Rhodes. 17

Denote by $X_1$ an index for the production of manufactured durable
goods in the United States, $X_2$ of non-durable manufactured goods, $X_3$
of minerals, and $X_4$ of agricultural products. All indices are computed
with the base 1935-39 = 100. The period covered is 1919-39. We use
annual figures. The indices $X_1$, $X_2$, $X_3$ are taken from the publications
of the Federal Reserve Board, and $X_4$ from the Yearbook of Agricultural
Statistics of the Department of Agriculture. The correlation matrix is
given in Table 3.

TABLE 3

Correlation Matrix

          X_1         X_2         X_3         X_4

X_1    1.000000    0.495941    0.872836    0.481240
X_2                1.000000    0.768279    0.709807
X_3                            1.000000    0.712358
X_4                                        1.000000


These four variables can be analyzed into various components or
factors. We want to find the first principal component which contributes
most to the variances of the standardized variables $z_1, \dots, z_4$.

The system of linear equations (10) which yields the coefficients of the
first and largest principal component is:

$$
\begin{aligned}
1.000000\,k_{11} + 0.495941\,k_{21} + 0.872836\,k_{31} + 0.481240\,k_{41} &= \lambda k_{11} \\
0.495941\,k_{11} + 1.000000\,k_{21} + 0.768279\,k_{31} + 0.709807\,k_{41} &= \lambda k_{21} \\
0.872836\,k_{11} + 0.768279\,k_{21} + 1.000000\,k_{31} + 0.712358\,k_{41} &= \lambda k_{31} \\
0.481240\,k_{11} + 0.709807\,k_{21} + 0.712358\,k_{31} + 1.000000\,k_{41} &= \lambda k_{41}
\end{aligned}
\tag{18}
$$

This system of linear homogeneous equations can have non-trivial
solutions only if its determinant becomes zero. The determinantal
equation (11) becomes:

$$
\begin{vmatrix}
1.000000 - \lambda & 0.495941 & 0.872836 & 0.481240 \\
0.495941 & 1.000000 - \lambda & 0.768279 & 0.709807 \\
0.872836 & 0.768279 & 1.000000 - \lambda & 0.712358 \\
0.481240 & 0.709807 & 0.712358 & 1.000000 - \lambda
\end{vmatrix}
= 0
\tag{19}
$$



16 G. Tintner: “Some applications of multivariate analysis to economic data,”
Journal of the American Statistical Association, vol. 41 (1946), pp. 482 ff. C. F.
Carter, W. B. Reddaway, R. Stone: The Measurement of Production Movements
(Cambridge, 1948).

17 E. C. Rhodes: “The construction of an index of business activity,” Journal
of the Royal Statistical Society, vol. 100 (1937), pp. 18 ff.





We have $\lambda_1 = 3.033424$. This is the largest root of equation (19).
The contributions of the first principal component to the variance of the
standardized variables are the squares of the following quantities:

$k_{11} = 0.817391$, $k_{21} = 0.888102$, $k_{31} = 0.951934$, $k_{41} = 0.818776$

Hence it follows that the first principal component “explains” about
67 per cent of the variance of $z_1$, about 79 per cent of the variance of $z_2$,
about 91 per cent of the variance of $z_3$, and about 67 per cent of the
variance of $z_4$.

The same results can also be used to exemplify the two other approaches 
to principal component analysis discussed above. The function u which 
minimizes the error variances and which also has the maximum sum of 
squares of correlation coefficients with all the variables (while its own 
variance is 1) is: 

$$
u = 0.269462\,z_1 + 0.292772\,z_2 + 0.313815\,z_3 + 0.269918\,z_4
\tag{20}
$$

The coefficients in (20) are proportional to the previous ones. It is
interesting to note that minerals ($z_3$) have the greatest weight.

The total variance of the four standardized variables is evidently 4.
Hence, since $\lambda_1 = 3.033424$, it appears that the first principal component
“explains” about 76 per cent of the total variance of the standardized
variates $z_1, \dots, z_4$.
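
The percentages and the weights of the index can be checked from the reported coefficients alone. The sketch below repeats this arithmetic in plain Python; the variable names are illustrative, and the observation that dividing each coefficient by the root reproduces the weights in (20) is a numerical check, not a statement of how the weights were originally computed.

```python
# Loadings of the first principal component and its root, as reported in the text
k = [0.817391, 0.888102, 0.951934, 0.818776]
lam1 = 3.033424

shares = [x * x for x in k]            # share of the variance of each z_i explained
total_share = lam1 / len(k)            # share of the total variance (which is 4)
weights = [x / lam1 for x in k]        # loadings divided by the root

for i, (s, w) in enumerate(zip(shares, weights), start=1):
    print(f"z_{i}: share = {s:.4f}, weight = {w:.6f}")
print(f"sum of shares = {sum(shares):.6f}, total share = {total_share:.4f}")
```

The squared coefficients sum to the root (up to rounding), the shares round to 67, 79, 91, and 67 per cent, and the ratio of the root to the total variance is about 0.76.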

The tentative economic interpretation of these results would appear 
to be as follows: It seems that the fluctuations in the four indices may be 
reasonably well represented by one factor. This general “factor" would 
account for more than three-fourths of the total variance of the individual 
production indices. A more detailed analysis should of course be carried 
out not only to confirm this result with respect to the broad categories 
of production used here but also to apply it to the production of individual 
commodities. 

This result, while not unexpected, is by no means trivial. It would be 
possible, for instance, to imagine an economy where the industrial sector 
and the agricultural sector have very little relationship. Then we would 
have two important factors, say industrial production and agricultural 
production. The first named would probably account for most of the
variance in $X_1$, $X_2$, and $X_3$, and the second for most of the variance of
$X_4$. This is obviously not the case in our example. The factor “production
in general” which we have indicated in (20) seems to account for
most of the variance of $X_4$ as well as for most of the variances of all
other variables.





Example 3. A second example, 18 which will not be presented in such
great detail, deals with prices. (Some of the data are presented in Example
3 of section 6.4.) Denote by $X_5$ an index of United States wholesale
farm prices, by $X_6$ an index of wholesale food prices, by $X_7$ an index of
all other wholesale prices. These are taken from the Bureau of Labor
Statistics indices for the period 1919-39. The base year of the indices is
1926. The indices are given annually. We want to find again the first
principal component which accounts for most of the variances of the
standardized variates $z_5$, $z_6$, and $z_7$.

An analysis of the data reveals that the contribution of the first principal
component to the variance of each standardized variable $z_5$, $z_6$, $z_7$
is the square of the corresponding coefficient: $k_{51} = 0.986867$,
$k_{61} = 0.990160$, and $k_{71} = 0.957621$. It appears that the first principal
component accounts for about 97 per cent of the variance of $z_5$, about
98 per cent of the variance of $z_6$, and about 92 per cent of the variance
of $z_7$.

The function $u$ which minimizes the variance of the random errors
and which also maximizes the sum of the squares of the correlation
coefficients with all the variables (while its own variance is 1), is:

$$
u = 0.343693\,z_5 + 0.344845\,z_6 + 0.333508\,z_7
\tag{21}
$$

The coefficients in (21) are again proportional to the $k$'s indicated above.
It is remarkable to note that here the weights given to the various
variables are approximately the same. The greatest root of the
determinantal equation is here $\lambda_1 = 2.871360$. The total variance of the three
standardized variables is 3. The first principal component can be said
to account for more than 95 per cent of the total variance of the three
standardized variables.

It is of some interest to correlate our “index” (21) with the All
Commodities Wholesale Price Index computed by the Bureau of Labor
Statistics. The resulting correlation coefficient is 0.991 and is highly 
significant for 19 degrees of freedom. 
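
The text does not say which test lies behind the phrase “highly significant.” One common choice, assumed here purely for illustration, is Student's t test of the hypothesis that the population correlation is zero, with $N - 2 = 19$ degrees of freedom for the 21 annual observations.

```python
import math

r = 0.991          # correlation reported in the text
df = 19            # degrees of freedom reported in the text (21 years minus 2)

# Student's t for testing a zero population correlation (an assumed test)
t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

Under that assumption the statistic is a little above 32, far beyond any conventional critical value for 19 degrees of freedom.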

Hence we would conclude that on the basis of the evidence presented
it seems likely that a “general” price index (21) could very well explain
most of the variability of prices. It appears that the residual variability
is really almost negligible. This result should, however, be checked by
an analysis of the prices of individual goods rather than of broad price
categories like those used in our own procedure.


18 G. Tintner: “Some applications of multivariate analysis to economic data,”
Journal of the American Statistical Association, vol. 41 (1946), pp. 484 ff.





This result is again what we should expect. But it is by no means as 
obvious as it seems. We could, for instance, imagine again an economy
in which the industrial and the agricultural sectors have very little
connection. Then we would have to distinguish two factors instead of one,
say industrial prices and agricultural prices. The first factor would
account for most of the variance of $X_7$ and the second for most of the
variances of $X_5$ and $X_6$. This is obviously not the case in the American
economy during the period analyzed. The index indicated in (21), which 
represents general price movements, never accounts for less than 90 per 
cent of the variance of any among our variables. 

6.4 Canonical Correlations 

In economic statistics we desire sometimes to find the relationship 
between sets of variables. The method of canonical correlations,
introduced into statistics by Hotelling, 1 provides means for accomplishing this.
We replace each of the two sets of variates by a linear combination of the 
variates contained in each set, the canonical variate. Then we endeavor 
to maximize the correlation between these two canonical variates, the 
canonical correlation. Methods for dealing with systems of equations 
will be presented in section 6.5 and Chapter 7. 

Assume that we have $p$ variables $X_1, X_2, \dots, X_p$ and $N$ observations
on each variable. The variables are divided into two groups:
$X_1, X_2, \dots, X_{p'}$ and $X_{p'+1}, X_{p'+2}, \dots, X_p$. We want to find
two linear functions of the standardized variables:

$$
U = k_1 z_1 + k_2 z_2 + \cdots + k_{p'} z_{p'}
\tag{1}
$$

and

$$
V = k_{p'+1} z_{p'+1} + k_{p'+2} z_{p'+2} + \cdots + k_p z_p
\tag{2}
$$

which have maximum correlation with each other. The variances of
$U$ and $V$ are respectively:

$$
\sum_{i=1}^{p'}\sum_{j=1}^{p'} r_{ij} k_i k_j = 1
\tag{3}
$$

$$
\sum_{i=p'+1}^{p}\sum_{j=p'+1}^{p} r_{ij} k_i k_j = 1
\tag{4}
$$


1 H. Hotelling: “Relations between two sets of variables,” Biometrika, vol.
28 (1936), pp. 321 ff. See also S. S. Wilks: Mathematical Statistics (Princeton,
1943), pp. 257 ff. M. G. Kendall: The Advanced Theory of Statistics, vol. 2
(London, 1946), pp. 348 ff.





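Both variances are supposed to be equal to 1. The canonical correlation
coefficient between $U$ and $V$ becomes:

$$
R = \sum_{i=1}^{p'}\sum_{j=p'+1}^{p} r_{ij} k_i k_j
\tag{5}
$$

This is to be made a maximum under the conditions that the two variances
are equal to 1.

In order to maximize $R$ (5) under the two conditions (3) and (4) we
introduce two Lagrange multipliers $\lambda$ and $\mu$. Then we form the new
function:

$$
F = \sum_{i=1}^{p'}\sum_{j=p'+1}^{p} r_{ij} k_i k_j
  - \tfrac{1}{2}\lambda \sum_{i=1}^{p'}\sum_{j=1}^{p'} r_{ij} k_i k_j
  - \tfrac{1}{2}\mu \sum_{i=p'+1}^{p}\sum_{j=p'+1}^{p} r_{ij} k_i k_j
\tag{6}
$$

We differentiate (6) with respect to the $k_i$, put the results equal to zero,
and obtain the following sets of equations:

$$
\begin{aligned}
-\lambda k_1 - \lambda r_{12}k_2 - \cdots - \lambda r_{1p'}k_{p'}
  + r_{1,p'+1}k_{p'+1} + \cdots + r_{1p}k_p &= 0 \\
&\;\;\vdots \\
-\lambda r_{1p'}k_1 - \lambda r_{2p'}k_2 - \cdots - \lambda k_{p'}
  + r_{p',p'+1}k_{p'+1} + \cdots + r_{p'p}k_p &= 0 \\
r_{1,p'+1}k_1 + r_{2,p'+1}k_2 + \cdots + r_{p',p'+1}k_{p'}
  - \mu k_{p'+1} - \cdots - \mu r_{p'+1,p}k_p &= 0 \\
&\;\;\vdots \\
r_{1p}k_1 + r_{2p}k_2 + \cdots + r_{p'p}k_{p'}
  - \mu r_{p'+1,p}k_{p'+1} - \cdots - \mu k_p &= 0
\end{aligned}
\tag{7}
$$

This system of linear equations (7) is again very similar to the normal
equations in classical multiple regression analysis [formula (5) in section
...].

We multiply the first $p'$ equations of system (7) by $k_1, k_2, \dots, k_{p'}$
and add. We also multiply the last $p - p'$ equations by
$k_{p'+1}, \dots, k_p$ and sum. The results are:

$$
\sum_{i=1}^{p'}\sum_{j=p'+1}^{p} r_{ij} k_i k_j
  - \lambda \sum_{i=1}^{p'}\sum_{j=1}^{p'} r_{ij} k_i k_j = 0
\tag{8}
$$

$$
\sum_{i=1}^{p'}\sum_{j=p'+1}^{p} r_{ij} k_i k_j
  - \mu \sum_{i=p'+1}^{p}\sum_{j=p'+1}^{p} r_{ij} k_i k_j = 0
\tag{9}
$$

Taking the relations (3), (4), and (5) into account, we have:

$$
R = \lambda = \mu
\tag{10}
$$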







Both Lagrange multipliers are equal to the canonical correlation
coefficient.

We eliminate the coefficients $k_1, k_2, \dots, k_p$ from the set of equations
(7). The resulting system of equations is:

$$
\begin{aligned}
g_{1,p'+1}k_{p'+1} + g_{1,p'+2}k_{p'+2} + \cdots + g_{1p}k_p &= \lambda k_1 \\
&\;\;\vdots \\
g_{p',p'+1}k_{p'+1} + g_{p',p'+2}k_{p'+2} + \cdots + g_{p'p}k_p &= \lambda k_{p'} \\
h_{p'+1,1}k_1 + h_{p'+1,2}k_2 + \cdots + h_{p'+1,p'}k_{p'} &= \lambda k_{p'+1} \\
&\;\;\vdots \\
h_{p1}k_1 + h_{p2}k_2 + \cdots + h_{pp'}k_{p'} &= \lambda k_p
\end{aligned}
\tag{11}
$$

From this set of equations we derive finally:

$$
\begin{aligned}
f_{11}k_1 + f_{12}k_2 + \cdots + f_{1p'}k_{p'} &= \lambda^2 k_1 \\
&\;\;\vdots \\
f_{p'1}k_1 + f_{p'2}k_2 + \cdots + f_{p'p'}k_{p'} &= \lambda^2 k_{p'}
\end{aligned}
\tag{12}
$$

where

$$
f_{ij} = \sum_{r=p'+1}^{p} g_{ir}h_{rj} \qquad (i, j = 1, 2, \dots, p')
$$

Equation (12) is a system of linear homogeneous equations. The
unknowns $k_1, \dots, k_{p'}$ can be found if the determinant of the system
is equal to zero. They should not all be identically zero, which
evidently is not a meaningful solution. Hence we have the determinantal
equation:

$$
\begin{vmatrix}
f_{11} - \lambda^2 & f_{12} & \cdots & f_{1p'} \\
\vdots & \vdots & & \vdots \\
f_{p'1} & f_{p'2} & \cdots & f_{p'p'} - \lambda^2
\end{vmatrix}
= 0
\tag{13}
$$

This condition determines $\lambda^2$, the square of the maximum canonical
correlation coefficient (10). We take the largest root of the determinantal
equation (13) since we desire to maximize the canonical correlation.
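
The steps (11) to (13) can be traced numerically for the smallest interesting case, two variables in each set. The sketch below is an illustration under stated assumptions: it is written in plain Python, the helper names are hypothetical, the coefficients $g$ and $h$ are obtained by multiplying the two blocks of (7) by the inverses of the within-set correlation matrices, and the correlation values are invented so that the answer can be checked by hand.

```python
import math

def inverse_2x2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def matmul(a, b):
    """Product of two small matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def largest_canonical_correlation(r11, r12, r22):
    """Largest root of (13) for two variables per set; returns lambda, not its square."""
    r21 = [[r12[j][i] for j in range(2)] for i in range(2)]   # transpose of r12
    g = matmul(inverse_2x2(r11), r12)     # coefficients g in system (11)
    h = matmul(inverse_2x2(r22), r21)     # coefficients h in system (11)
    f = matmul(g, h)                      # f_ij = sum over r of g_ir * h_rj
    trace = f[0][0] + f[1][1]
    det = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    # roots of x^2 - trace*x + det = 0; the larger one is lambda squared
    disc = max(trace * trace - 4.0 * det, 0.0)
    lam_sq = (trace + math.sqrt(disc)) / 2.0
    return math.sqrt(lam_sq)

# Hypothetical correlations: variables uncorrelated within each set,
# cross-correlations 0.8 and 0.3 between matched pairs
identity = [[1.0, 0.0], [0.0, 1.0]]
cross = [[0.8, 0.0], [0.0, 0.3]]
lam = largest_canonical_correlation(identity, cross, identity)
print(round(lam, 6))
```

In this artificial case the $f_{ij}$ form a diagonal array with entries 0.64 and 0.09, so the largest root of (13) is 0.64 and the maximum canonical correlation is 0.8.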

The joint distribution of the roots of this equation has been found by 
various authors 2 under the hypothesis that their population value is zero. 
Standard errors have been derived earlier by Hotelling. 


2 R. A. Fisher: “The sampling distribution of some statistics obtained from
non-linear equations,” Annals of Eugenics, vol. 9 (1939), pp. 238 ff.;
Contributions to Mathematical Statistics (New York, 1950), paper 36. P. L. Hsu: “On









Inserting the value of $\lambda$ into the previous system of equations (12), we
find $k_1, \dots, k_{p'}$. Taking these solutions into system (11), we get the
remaining coefficients $k_{p'+1}, \dots, k_p$. These provide estimates of the
canonical variates in the population corresponding to our sample. These
canonical variates are such that $U$ is most successful in predicting $V$, and
$V$ is the best predictor of $U$.

It should perhaps be emphasized that these methods do not necessarily 
yield results which can be readily interpreted in terms of economic theory. 
This problem of identification will be discussed in greater detail in the 
last section of this chapter (6.5) and in Chapter 7. 

Hotelling 3 applied canonical correlation methods first to some psycho¬ 
logical data. But he indicated the possibility of applying this method 
to certain economic problems, e.g., the effect of crops of agricultural 
products on their prices. 4 


The two most successful attempts to apply these methods to economic 
data have been made by F. V. Waugh. 5 He studied (1) the relations 
between consumption and prices of various types of meat, and (2) the 
relation between characteristics of wheat and characteristics of flour.

Example 1. We indicate the first analysis (consumption and prices) as
follows: Let $X_1$ be steer prices and $X_2$ hog prices, $X_3$ beef consumption
and $X_4$ pork consumption. Then the two canonical variates are
U = 1.711[...]1 X_1 [...] 1.54037 X_2 for the prices and
V = 5.25679 X_3 + [...] 5.45684 X_4 for the consumption. These canonical
variates are chosen in such a fashion as to maximize the (canonical)
correlation between $U$ and $V$. This correlation turns out to be -0.84666.
$U$ is the most successful linear combination of the prices to predict $V$,
and $V$ is the best linear combination of the consumption data for
predicting $U$.

Example 2. In the other example given by Waugh (wheat and flour
characteristics) let the wheat characteristics be as follows: $X_1$ kernel
texture, $X_2$ test weight, $X_3$ damaged kernels, $X_4$ foreign materials, $X_5$
crude protein content. The flour characteristics are as follows: $X_6$ wheat
per barrel of flour, $X_7$ ash in flour, $X_8$ crude protein in flour, $X_9$ gluten


the distribution of roots of certain determinantal equations,” ibid., pp. 250 ff.
S. S. Wilks: op. cit., pp. 265 ff. A. M. Mood: “On the distribution of the
characteristic roots of the normal second-moment matrix,” Annals of
Mathematical Statistics, vol. 22 (1951), pp. 266 ff. D. N. Nanda: “Distribution of a
root of a determinantal equation,” ibid., vol. 19 (1948), pp. 47 ff.

3 H. Hotelling: op. cit., pp. 342 ff.

4 Ibid., pp. 322, 376 ff.

5 F. V. Waugh: “Regressions between sets of variables,” Econometrica,
vol. 10 (1942), pp. 290 ff.




quality index. The canonical variate formed from the wheat characteristics 
is 

$U = 0.03902 X_1 + 0.23817 X_2 - 0.03172 X_3 - 1.18545 X_4 + 0.77554 X_5$. 

The canonical variate formed from the flour characteristics is as follows: 

$V = -0.11971 X_6 - 13.12015 X_7 + 1.12464 X_8 + 0.05903 X_9$. 

The (canonical) correlation between $U$ and $V$ is 0.909388. This is the highest possible 
correlation between any linear combinations of wheat and flour characteristics. 
$U$ may be used to predict $V$, and $V$ is most successful in predicting $U$. 

Example 3. In the following example 6 we will try to determine the 
relationship between certain price indices and some production indices by 
the method of canonical correlation. The data are the following: $X_1$ 
is the index of production of manufactured durable goods, $X_2$ of nondurable 
goods in the United States. $X_3$ is the production index of 
minerals, and $X_4$ the index for agricultural products. All these indices 
are given annually for the base 1935-39 = 100. They have been taken 
from the publications of the Federal Reserve Board, except for $X_4$, which 
comes from the Department of Agriculture. These production indices 
form the first group (p = 4). 

The yearly price indices, all given for the base 1926 = 100, are taken 
from the publications of the Bureau of Labor Statistics. All are wholesale 
prices. $X_5$ denotes farm prices, $X_6$ food prices, and $X_7$ other prices 
(p = 7). The period covered by all these indices is 1919-39. They are 
annual data given for N = 21 years. 

The matrix of the correlation coefficients is given in Table 1. 


TABLE 1 

Correlation Matrix 

          X_1        X_2        X_3        X_4        X_5        X_6        X_7 
X_1    1.000000   0.495941   0.872836   0.481240  -0.436385  -0.427250  -0.203390 
X_2               1.000000   0.768279   0.709807   0.425728   0.429576   0.584220 
X_3                          1.000000   0.712358  -0.038273  -0.043762   0.138680 
X_4                                     1.000000   0.261010   0.267098   0.378452 
X_5                                                1.000000   0.987285   0.904598 
X_6                                                           1.000000   0.914394 
X_7                                                                      1.000000 


We want to find two linear functions (canonical variates): 

(14) $U = k_1 z_1 + k_2 z_2 + k_3 z_3 + k_4 z_4$ 

and 

(15) $V = k_5 z_5 + k_6 z_6 + k_7 z_7$ 


6 G. Tintner: “Some applications of multivariate analysis to economic data,” 
Journal of the American Statistical Association , vol. 41 (1946), pp. 487 ff. 





The variances of these two functions $U$ and $V$ should be one and their 
correlation a maximum. 

The conditions which must be fulfilled by these coefficients $k_1, \ldots, k_7$ 
are from formula (7): 

(16) 
$$\begin{aligned}
-1.000000\lambda k_1 - 0.495941\lambda k_2 - 0.872836\lambda k_3 - 0.481240\lambda k_4 - 0.436385 k_5 - 0.427250 k_6 - 0.203390 k_7 &= 0 \\
-0.495941\lambda k_1 - 1.000000\lambda k_2 - 0.768279\lambda k_3 - 0.709807\lambda k_4 + 0.425728 k_5 + 0.429576 k_6 + 0.584220 k_7 &= 0 \\
-0.872836\lambda k_1 - 0.768279\lambda k_2 - 1.000000\lambda k_3 - 0.712358\lambda k_4 - 0.038273 k_5 - 0.043762 k_6 + 0.138680 k_7 &= 0 \\
-0.481240\lambda k_1 - 0.709807\lambda k_2 - 0.712358\lambda k_3 - 1.000000\lambda k_4 + 0.261010 k_5 + 0.267098 k_6 + 0.378452 k_7 &= 0 \\
-0.436385 k_1 + 0.425728 k_2 - 0.038273 k_3 + 0.261010 k_4 - 1.000000\mu k_5 - 0.987285\mu k_6 - 0.904598\mu k_7 &= 0 \\
-0.427250 k_1 + 0.429576 k_2 - 0.043762 k_3 + 0.267098 k_4 - 0.987285\mu k_5 - 1.000000\mu k_6 - 0.914394\mu k_7 &= 0 \\
-0.203390 k_1 + 0.584220 k_2 + 0.138680 k_3 + 0.378452 k_4 - 0.904598\mu k_5 - 0.914394\mu k_6 - 1.000000\mu k_7 &= 0
\end{aligned}$$ 

Solving these equations, we have the following system from formula (11): 

(17) 
$$\begin{aligned}
\lambda k_1 &= 1.258117 k_5 + 1.136323 k_6 + 0.696786 k_7 \\
\lambda k_2 &= -0.557767 k_5 - 0.615021 k_6 - 0.831246 k_7 \\
\lambda k_3 &= -0.601224 k_5 - 0.419682 k_6 - 0.040741 k_7 \\
\lambda k_4 &= -0.042272 k_5 - 0.078433 k_6 - 0.094727 k_7 \\
\lambda k_5 &= 0.660727 k_1 + 0.021638 k_2 - 0.115510 k_3 + 0.166631 k_4 \\
\lambda k_6 &= 0.826393 k_1 + 0.617089 k_2 + 1.153395 k_3 + 0.318826 k_4 \\
\lambda k_7 &= -1.149933 k_1 - 1.168042 k_2 - 1.088823 k_3 - 0.820701 k_4
\end{aligned}$$ 


Finally we get, by substituting again from formula (12): 

(18) 
$$\begin{aligned}
0.881606 k_5 + 0.772900 k_6 + 0.431320 k_7 &= \lambda^2 k_5 \\
-0.011419 k_5 + 0.050461 k_6 - 0.014326 k_7 &= \lambda^2 k_6 \\
-0.105935 k_5 - 0.066995 k_6 + 0.291776 k_7 &= \lambda^2 k_7
\end{aligned}$$ 

Using the iteration methods developed by Hotelling 7 and the 


7 H. Hotelling: op. cit., pp. 342 ff. 





computation schemes of Waugh 8 (Appendix A.2.4), we get finally the 
following results: 

(19) $U = 1.094989 z_1 - 0.371620 z_2 - 0.587650 z_3 - 0.020964 z_4$ 

(20) $V = 1.000000 z_5 - 0.011424 z_6 - 0.215485 z_7$ 

Computational matters are treated in Appendix A.2. These results are 
given for the standardized variables $z_i$. 

The (canonical) correlation coefficient between these two linear functions 
of the quantities produced $U$ (19) and the prices $V$ (20) is 0.8831. 
$U$ and $V$ are chosen in such a manner that they have the highest possible 
correlation with each other. 
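
The computation in Example 3 can be retraced numerically from Table 1. The following is a minimal sketch, not the iteration scheme of Hotelling or the computation scheme of Waugh used in the text: it assumes NumPy and obtains the canonical correlations as singular values after a Cholesky "whitening" of each group. The variable names are chosen here for illustration. Only the ratios of the weights are compared with (19) and (20), since the text scales $V$ so that the weight of $z_5$ equals one.

```python
import numpy as np

# Upper triangle of Table 1 (X1..X4 production indices, X5..X7 price indices).
upper = np.array([
    [1.000000, 0.495941, 0.872836, 0.481240, -0.436385, -0.427250, -0.203390],
    [0.0,      1.000000, 0.768279, 0.709807,  0.425728,  0.429576,  0.584220],
    [0.0,      0.0,      1.000000, 0.712358, -0.038273, -0.043762,  0.138680],
    [0.0,      0.0,      0.0,      1.000000,  0.261010,  0.267098,  0.378452],
    [0.0,      0.0,      0.0,      0.0,       1.000000,  0.987285,  0.904598],
    [0.0,      0.0,      0.0,      0.0,       0.0,       1.000000,  0.914394],
    [0.0,      0.0,      0.0,      0.0,       0.0,       0.0,       1.000000],
])
R = upper + upper.T - np.eye(7)          # fill in the symmetric lower triangle

R11, R12, R22 = R[:4, :4], R[:4, 4:], R[4:, 4:]

# Whiten each group with its Cholesky factor; the singular values of
# K = L1^{-1} R12 L2^{-T} are the canonical correlations.
L1 = np.linalg.cholesky(R11)
L2 = np.linalg.cholesky(R22)
K = np.linalg.solve(L1, R12) @ np.linalg.inv(L2).T
left, sing, right_t = np.linalg.svd(K)

rho = sing[0]                              # largest canonical correlation
a = np.linalg.solve(L1.T, left[:, 0])      # weights of z1..z4 (unit variance)
b = np.linalg.solve(L2.T, right_t[0])      # weights of z5..z7 (unit variance)

print("largest canonical correlation:", round(float(rho), 4))
print("price weights relative to z5:", np.round(b / b[0], 4))
print("production weights relative to z1:", np.round(a / a[0], 4))
```

The singular-value route is used only because it is short and numerically stable; it yields the same root $\lambda^2$ as the reduced system for $k_5, k_6, k_7$.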

This also can be expressed in the following way. Our U is the linear 
combination of the various production indices which is most successful 
in predicting the general price “index” V. At the same time V is the 
linear combination of price indices which is best in order to predict the 
general production “index” U. 

Needless to say, our results are to be interpreted with a certain amount 
of caution. After all, we are dealing here only with a few production 
and price indices for rather broad categories of goods and not with the 
production and price data for individual commodities. We can never¬ 
theless reach some tentative conclusions. Our method does not imply 
that we obtain necessarily economically meaningful (“structural”) relation¬ 
ships. The problem of estimating structural relationships will be discussed 
in the following section, 6.5, and in Chapter 7. 

The first index $U$ (19) shows that in trying to estimate the mutual interdependence 
between production and prices the largest weight has to be 
given to the production of durable goods. This agrees with the ideas of 
many students of the business cycle. 9 The weight given to production 
of minerals is also quite important but negative. Agricultural products 
seem to play only a very insignificant part. It is especially the weighted 
difference between the movements of the production of durable goods 
and the production of minerals which appears to be decisive. This 
points in the direction of certain business cycle theories, especially those 
stressing the different behavior of various producers' and consumers' 
goods in the cycle. 10 

8 F. V. Waugh: op. cit., pp. 301 ff. 

9 G. von Haberler: Prosperity and Depression (Geneva, 1939), pp. 29 ff., 
279 ff. J. Tinbergen: The Dynamics of Business Cycles (Chicago, 1950), pp. 
171 ff. 

10 G. von Haberler: op. cit., pp. 29 ff. H. S. Ellis: German Monetary Theory 
(Cambridge, Mass., 1934), pp. 335 ff. 





Highest weight in V (20) is given to farm prices and some weight also 
to the index representing all other prices. But this weight is negative. 
It is probably due to the fact that many miscellaneous prices are contained 
in this category that the influence is here considerably smaller. Food 
prices seem not to play a very important role in the determination of V. 
This is again in agreement with some business cycle theories. 11 It is 
interesting to note that it is the weighted difference between farm prices 
and prices of other commodities which appears to be decisive. This 
seems to point in the direction of certain business cycle theories. 12 

6.5 Weighted Regression 

Ordinary multiple regression (section 5.1) tries to estimate the relationship 
between a “dependent” variable and a set of “independent” variables 
in such a way as to make the prediction of the dependent variable most 
successful. The sum of squares of 
the deviations from a linear combination of the fixed values of the independent 
variables becomes as small as possible (method of least squares). 
This assumes that we want to predict the dependent variable most 
accurately for fixed values of the “predictors,” i.e., the independent 
variables. 1 

This method evidently breaks down if we are interested not in prediction 
under the assumption that the fundamental relationships are unchanged 
but in the establishment of “structural” relationships existing in the 
population, and if we also assume that all variables are subject to disturbances 
or errors. For theoretical purposes and also for purposes of 
economic policy this is most important, as Haavelmo has shown 2 (section 1.2). 

Structural relations are, for instance, demand functions, supply func¬ 
tions, production functions, the consumption function. Structural co¬ 
efficients are the parameters which determine the special form of these 


11 G. von Haberler: op. cit., pp. 151 ff. 

12 G. Tintner: Prices in the Trade Cycle (Vienna, 1935). F. A. von Hayek: 
Preise und Produktion (Vienna, 1931); Prices and Production (London, 1931). 
A. H. Hansen: Business Cycles and National Income (New York, 1951), pp. 
384 ff. 

1 H. Hotelling: “The selection of variates for use in prediction with some 
comments on the general problem of nuisance parameters,” Annals of Mathematical 
Statistics, vol. 11 (1940), pp. 271 ff. 

2 T. Haavelmo: “The statistical implications of a system of simultaneous 
equations,” Econometrica, vol. 11 (1943), pp. 1 ff.; “The probability approach 
in econometrics,” ibid., vol. 12 (1944), supplement. 





relationships. They are, for instance, price elasticity of demand, income 
elasticity of demand, marginal cost, propensity to consume. 

Consider, e.g., the following situation: There is an isolated market 
where the equilibrium price and the equilibrium quantity are determined 
by the intersection of the demand function and the supply function. We 
have a series of observations on the prices and quantities. 

The simple regression of the quantity on the price will give us a relation¬ 
ship which enables us to predict most successfully the quantity sold if 
the price of the commodity is given. The simple regression of the price 
on the quantity gives another relation which may be used to predict as 
accurately as possible the price of the commodity, if the quantity is given. 
These relationships hold only if other things are equal. They are not 
structural, nor are structural relationships (demand and supply functions) 
needed if we want to make predictions of the type indicated. 

But consider now a situation in which the government intervenes on 
the market. Assume, for instance, that the government fixes the price 
of the commodity. That is to say, one structural relationship (the supply 
function) is supplanted by the government action. In order to evaluate 
the effect of price fixing we must now know the demand function, which 
is a structural relationship. The elasticity of demand, a structural parameter, 
must be known if we want to form an idea about the following 
problem: by how many per cent will the quantity demanded decline, 
ceteris paribus, if the fixed price increases by 1 per cent? 

This and similar questions are evidently of great importance for econ¬ 
omic policy. But in order to answer such questions we must form esti¬ 
mates of the structural relationships and estimate structural parameters. 
The simple regression of the price on the quantity, or of the quantity on 
the price, cannot in general supply these estimates. 
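
The point can be illustrated with a small simulation. This is a sketch under stated assumptions, not an example from the text: the linear demand and supply functions, their slopes and intercepts, and the random shifts `u` and `v` are all hypothetical, and the shifts enter as disturbances in the equations (the case treated in Chapter 7) rather than as errors in the variables. It shows only that the simple regression of quantity on price need not reproduce the structural demand slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical structural parameters (illustrative values only).
demand_slope = -1.0   # demand:  q = 10 + demand_slope * p + u
supply_slope = 1.0    # supply:  q =  2 + supply_slope * p + v
u = rng.normal(0.0, 1.0, n)   # random shifts of the demand function
v = rng.normal(0.0, 1.0, n)   # random shifts of the supply function

# Market equilibrium: solve the two equations for price and quantity.
p = (10.0 - 2.0 + u - v) / (supply_slope - demand_slope)
q = 2.0 + supply_slope * p + v

# Simple regression of quantity on price.
ols_slope = np.cov(p, q)[0, 1] / np.var(p, ddof=1)
print("structural demand slope:", demand_slope)
print("regression slope of q on p:", round(float(ols_slope), 3))
```

With equal variances of the two shifts and slopes of equal size, the regression slope is close to zero, a mixture of the demand and supply slopes. It is useful for prediction under unchanged conditions but not as an estimate of the elasticity of demand.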

We adopt, for instance, the following stochastic scheme: Not only the 
dependent variable but all variables in the system contemplated are 
subject to error (Frisch). 3 There are, however, no errors in the equations. 
These will be treated in Chapter 7. We do not want to predict one of 
the variables for fixed values of the others but we want to estimate the 
structural relationships themselves, i.e., the regression coefficients of the 


3 R. Frisch: Statistical Confluence Analysis by Means of Complete Regression 
Systems (Oslo, 1934). G. Tintner: “A note on the economic aspects of the 
theory of errors in economic time series,” Quarterly Journal of Economics, 
vol. 53 (1938), pp. 141 ff. O. Reiersol: “Confluence analysis by means of lag 
moments and other methods of confluence analysis,” Econometrica, vol. 9 
(1941), pp. 1 ff. T. W. Anderson and L. Hurwicz: “Errors and shocks in 
economic relationships,” ibid., vol. 17 (1949), supplement, pp. 23 ff. 





“weighted” regression equation. Estimation methods different from 
those given here have been developed by Geary 4 and Wald. 5 

The estimation of structural relationships is most important in con¬ 
nection with problems arising in economic policy. This will be illustrated 
later, especially with the help of Examples 3 and 4. 

The problem of distinguishing the various economically meaningful 
relationships, e.g., demand functions and supply functions, is also very 
important. This is the problem of identification. This problem will be 
discussed in section 7.1. 

The method of weighted regression was developed by Koopmans 6 on 
the basis of earlier work by many authors, 7 among whom Rhodes 8 and 
van Uven 9 are the most important. The problem of identification 10 will 
be discussed in more detail in connection with Example 3 below and in 
section 7.1. 

In the most general form we can pose the problem in the following way: 


4 R. C. Geary: “Determination of linear relations between the systematic 
parts of variables with errors of observations the variances of which are unknown,” 
Econometrica, vol. 17 (1949), pp. 30 ff. 

5 A. Wald: “The fitting of straight lines if both variables are subject to 
error,” Annals of Mathematical Statistics, vol. 11 (1940), pp. 284 ff. M. S. 
Bartlett: “Fitting a straight line when both variables are subject to error,” 
Biometrics, vol. 5 (1949), pp. 207 ff. 

6 T. C. Koopmans: Linear Regression Analysis in Economic Time Series 
(Haarlem, 1937). 

7 K. Pearson: “On lines and planes of closest fit,” Philosophical Magazine, 
series 6, vol. 2 (1901), pp. 559 ff. C. Gini: “Sull'interpolazione di una retta 
quando i valori della variabile indipendente sono affetti da errori accidentali,” 
Metron, vol. 1 (1921), pp. 63 ff. G. Pietra: “Interpolating plane curves,” ibid., 
vol. 3 (1924), pp. 311 ff. C. F. Roos: “A general invariant criterion of fit for 
lines and planes where all variables are subject to error,” ibid., vol. 13 (1937), 
pp. 3 ff. W. E. Deming: Statistical Adjustment of Data (New York, 1943), 
pp. 128 ff. 

8 E. C. Rhodes: “On lines and planes of closest fit,” Philosophical Magazine, 
series 7, vol. 3 (1927), pp. 357 ff. 

9 M. J. van Uven: “Adjustment of N points (in n-dimensional space) to the 
best linear (n - 1)-dimensional space,” Koninklijke Akademie van Wetenschappen 
te Amsterdam, Proceedings of the Section of Sciences, vol. 23 (1930), pp. 143 ff. 

10 L. R. Klein: “Pitfalls in the statistical determination of the investment 
schedule,” Econometrica, vol. 11 (1943), pp. 246 ff. O. Reiersol: “Identifiability 
of linear relations between variables which are subject to error,” ibid., 
vol. 18 (1950), pp. 375 ff. G. Tintner: “Die Identifikation: ein Problem der 
Oekonometrie,” Statistische Vierteljahresschrift, vol. 3 (1950), pp. 7 ff. 





Assume that we have $R$ meaningful (identified) linear relationships 
between the $p$ economic variables $M_{it}$: 

(1) $k_{v0} + \sum_{j=1}^{p} k_{vj} M_{jt} = \varepsilon_{vt}$  ($v = 1, 2, \ldots, R$; $t = 1, 2, \ldots, N$) 

where $k_{v0}, k_{v1}, k_{v2}, \ldots, k_{vp}$ are “structural” coefficients and $\varepsilon_{vt}$ is a 
random term. It represents the variables which have not been included 
in the equation. They give rise to the errors in the equations which are 
here neglected but will be discussed in Chapter 7. 

But actually we don’t observe the “true” variables $M_{1t}, \ldots, M_{pt}$ but 
the empirical variables $X_{it}$ ($t = 1, 2, \ldots, N$). We have $N$ observations 
on each variable. Let us assume that the systematic part $M_{it}$ is the 
mathematical expectation of $X_{it}$ and the $\eta_{it}$ are the random disturbances: 

(2) 
$$\begin{aligned}
X_{1t} &= M_{1t} + \eta_{1t} \\
&\;\;\vdots \\
X_{pt} &= M_{pt} + \eta_{pt} \qquad (t = 1, 2, \ldots, N)
\end{aligned}$$ 

We assume that the “disturbances” or errors $\eta_{it}$ are normally distributed. 
They arise as errors of measurement, 11 from lack of representativeness 
of the empirical variables $X_{it}$, from frictional causes, 12 etc. It 
has been proposed to call the $\eta_{it}$ disturbances or errors in the variables 
and the $\varepsilon_{vt}$ disturbances or errors in the equations. 

There are two possibilities in dealing with the situation represented 
by (1). We can neglect either the disturbances $\eta_{it}$ or the random term 
$\varepsilon_{vt}$. This random term results from variables not included in the analysis 
and similar causes. The first approach underlies Haavelmo's, 13 Wald’s, 14 
and Marschak’s 15 work. It will be treated in Chapter 7. The second 
assumption is the basis of the fundamental scheme of Frisch 16 and the 


11 D. Brunt: The Combination of Observations (Cambridge, 1931), pp. 1 ff. 
O. Morgenstern: On the Accuracy of Economic Observations (Princeton, 1950). 

12 G. Tintner: “A note on economic aspects of the theory of errors in time 
series,” Quarterly Journal of Economics, vol. 53 (1938), pp. 141 ff. 

13 T. Haavelmo: “The probability approach in econometrics,” Econometrica, 
vol. 12 (1944), supplement. See also T. C. Koopmans: “Statistical estimation of 
simultaneous economic relations,” Journal of the American Statistical Association, 
vol. 40 (1945), pp. 448 ff. 

14 H. B. Mann and A. Wald: “On the statistical treatment of linear stochastic 
difference equations,” Econometrica, vol. 11 (1943), pp. 173 ff. 

15 J. Marschak and W. H. Andrews: “Random simultaneous equations and 
the theory of production,” Econometrica, vol. 12 (1944), pp. 143 ff. 

16 R. Frisch: op. cit. See also R. Stone: “The analysis of market demand,” 
Journal of the Royal Statistical Society, vol. 108 (1945), pp. 1 ff. 






weighted regression analysis developed by Koopmans. 17 We are going 
to deal here only with the second case. Our equations (1) become: 

(3) $k_{v0} + \sum_{j=1}^{p} k_{vj} M_{jt} = 0$  ($v = 1, 2, \ldots, R$; $t = 1, 2, \ldots, N$) 

We assume that the variance-covariance matrix of the errors in the 
variables (disturbances) $\eta_{1t}, \eta_{2t}, \ldots, \eta_{pt}$ is known and denote it by $[V_{ij}]$. 
This matrix will frequently not be known, but we may be able to estimate 
it, e.g., by the variate difference method. 18 This procedure will be discussed 
in section 11.2. The matrix is independent of $t$. Its inverse is 
$[V^{ij}] = [V_{ij}]^{-1}$ (Appendix A.1). 

Our purpose is not prediction but the estimation of the structural 
coefficients $k_{v0}, k_{v1}, k_{v2}, \ldots, k_{vp}$. 

The method of maximum likelihood leads to the method of least squares. 
We have to minimize the sum of squares: 

(4) $Q = \sum_{t=1}^{N} Q_t$ 

where: 

(5) $Q_t = \sum_{i=1}^{p} \sum_{j=1}^{p} V^{ij} (x_{it} - m_{it})(x_{jt} - m_{jt})$  ($t = 1, 2, \ldots, N$) 

The $x_{it}$ and $m_{it}$ are the deviations of $X_{it}$ and $M_{it}$ from their respective 
arithmetic means. 


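To normalize and orthogonalize the structural coefficients $k_{vi}$ ($v = 1, 
2, \ldots, R$; $i = 1, 2, \ldots, p$) we introduce the following relations: 

(6) $g_{vw} = \sum_{i=1}^{p} \sum_{j=1}^{p} k_{vi} k_{wj} V_{ij} = \delta_{vw}$  ($v, w = 1, 2, \ldots, R$) 

where $\delta_{vw}$ is the Kronecker delta. It is 1 for $v = w$, zero otherwise. 
Now we introduce the new function: 

(7) $F = Q + 2 \sum_{v=1}^{R} \sum_{t=1}^{N} \mu_{vt} \sum_{j=1}^{p} k_{vj} m_{jt}$ 

where the $\mu_{vt}$ are Lagrange multipliers. 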



17 T. C. Koopmans: Linear Regression Analysis of Economic Time Series 
(Haarlem, 1937). G. Tintner: “An application of the variate difference method 
to multiple regression,” Econometrica, vol. 12 (1944), pp. 97 ff.; “A note on 
rank, multicollinearity and multiple regression,” Annals of Mathematical Statistics, 
vol. 16 (1945), pp. 304 ff.; “Multiple regression for systems of equations,” 
Econometrica, vol. 14 (1946), pp. 5 ff.; “Die Identifikation: ein Problem der 
Oekonometrie,” Statistische Vierteljahresschrift, vol. 3 (1950), pp. 7 ff. 

18 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940). 





First we differentiate (7) with respect to the $m_{it}$ ($i = 1, 2, \ldots, p$; 
$t = 1, 2, \ldots, N$). We set the derivatives equal to zero and derive the relations: 

(8) $\sum_{j=1}^{p} V^{ij} (x_{jt} - m_{jt}) = \sum_{v=1}^{R} \mu_{vt} k_{vi}$  ($i = 1, 2, \ldots, p$) 

This system of linear equations can be solved for $x_{it} - m_{it}$: 

(9) $x_{it} - m_{it} = \sum_{v=1}^{R} \sum_{j=1}^{p} \mu_{vt} V_{ij} k_{vj}$  ($i = 1, 2, \ldots, p$; $t = 1, 2, \ldots, N$) 


We multiply (9) by $k_{wi}$ and sum with respect to $i$. The result is 

(10) $\mu_{vt} = \sum_{j=1}^{p} k_{vj} x_{jt}$  ($v = 1, 2, \ldots, R$; $t = 1, 2, \ldots, N$) 

Hence we obtain: 

(11) $Q_t = \sum_{v=1}^{R} \mu_{vt}^2 = \sum_{v=1}^{R} \left( \sum_{i=1}^{p} k_{vi} x_{it} \right)^2$ 

We impose another set of normalizing and orthogonalizing conditions 

(12) $f_{vw} = \sum_{t=1}^{N} \mu_{vt} \mu_{wt} = 0$  ($v \neq w$) 
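
The step from (9) to (10) and (11) can be written out as follows. This is a sketch of the intermediate algebra, assuming that the relations (3) hold in deviation form, $\sum_j k_{vj} m_{jt} = 0$, and that the coefficients satisfy the normalization (6); the index $w$ is introduced here only for the derivation.

```latex
% Multiply (9) by k_{wi} and sum over i:
\sum_{i=1}^{p} k_{wi} x_{it} - \sum_{i=1}^{p} k_{wi} m_{it}
   = \sum_{v=1}^{R} \mu_{vt} \sum_{i=1}^{p} \sum_{j=1}^{p} k_{wi} k_{vj} V_{ij} .
% The second term on the left vanishes by (3); by (6) the double sum
% on the right equals \delta_{wv}, so only the term v = w survives:
\sum_{i=1}^{p} k_{wi} x_{it} = \mu_{wt} .
% Insert (8) into (5), then use (3) and the result just obtained:
Q_t = \sum_{i=1}^{p} (x_{it} - m_{it}) \sum_{v=1}^{R} \mu_{vt} k_{vi}
    = \sum_{v=1}^{R} \mu_{vt} \sum_{i=1}^{p} k_{vi} x_{it}
    = \sum_{v=1}^{R} \mu_{vt}^{2} .
```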


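Now we have to minimize $Q$ (4) under conditions (6) and (12). We 
introduce new Lagrange multipliers $\alpha_{vw}$ and $\beta_{vw}$ ($v, w = 1, 2, \ldots, R$) and 
form the new function: 

(13) $G = Q + \sum_{v=1}^{R} \sum_{w=1}^{R} \beta_{vw} f_{vw} - \sum_{v=1}^{R} \sum_{w=1}^{R} \alpha_{vw} g_{vw}$ 

where $g_{vw}$ and $f_{vw}$ denote the left-hand sides of conditions (6) and (12). 
Because of considerations of symmetry we have $\alpha_{vw} = \alpha_{wv}$ and $\beta_{vw} 
= \beta_{wv}$ ($v, w = 1, 2, \ldots, R$). Also $\beta_{vv} = 0$. 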

Now we differentiate with respect to $k_{vi}$ ($v = 1, 2, \ldots, R$; $i = 1, 2, 
\ldots, p$) and set the result equal to zero. We obtain the following system 
of equations after dividing by $N - 1$: 

(14) $\sum_{j=1}^{p} k_{vj} a_{ij} + \sum_{w=1}^{R} \beta_{vw} \sum_{j=1}^{p} k_{wj} a_{ij} = \frac{1}{N-1} \sum_{w=1}^{R} \sum_{j=1}^{p} \alpha_{vw} k_{wj} V_{ij}$  ($v = 1, 2, \ldots, R$; $i = 1, 2, \ldots, p$) 

We multiply by $k_{vi}$ and sum with respect to $i$. We also multiply by 
$k_{zi}$ ($z \neq v$) and sum with respect to $i$. Taking relations (6) and (12) 
into account, we derive: 

(15) $\sum_{i=1}^{p} k_{vi} \sum_{j=1}^{p} k_{vj} a_{ij} = \frac{\alpha_{vv}}{N-1}$  ($v = 1, 2, \ldots, R$) 

(16) $\alpha_{vz} = \beta_{vz} = 0$  ($z \neq v$)  ($v, z = 1, 2, \ldots, R$) 





Putting $\lambda_v = \alpha_{vv}/(N - 1)$, system (14) becomes: 

(17) $\sum_{j=1}^{p} (a_{ij} - \lambda_v V_{ij}) k_{vj} = 0$  ($v = 1, 2, \ldots, R$; $i = 1, 2, \ldots, p$) 

This is again a homogeneous system of linear equations. Its similarity 
to the normal equations of multiple regression theory [section 5.1, formula 
(5)] should be noticed. 

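The constants $k_{v0}$ ($v = 1, 2, \ldots, R$) are given by the condition that 
the best fit must go through the means of our variables: 

(18) $k_{v0} + \sum_{j=1}^{p} k_{vj} \bar{X}_j = 0$  ($v = 1, 2, \ldots, R$) 

For the homogeneous linear system (17) to have a non-trivial solution its 
determinant must be equal to zero: 

(19) 
$$\begin{vmatrix}
a_{11} - \lambda V_{11} & a_{12} - \lambda V_{12} & \cdots & a_{1p} - \lambda V_{1p} \\
a_{21} - \lambda V_{21} & a_{22} - \lambda V_{22} & \cdots & a_{2p} - \lambda V_{2p} \\
\vdots & \vdots & & \vdots \\
a_{p1} - \lambda V_{p1} & a_{p2} - \lambda V_{p2} & \cdots & a_{pp} - \lambda V_{pp}
\end{vmatrix} = 0$$ 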

The smallest root of the determinantal equation (19) can be shown to 
be the minimum of Q divided by N — 1. 
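
The roots of (19) and the corresponding coefficients can be obtained numerically as a generalized characteristic-value problem. The sketch below is not the computation scheme of Appendix A.2: it assumes NumPy, takes $[a_{ij}]$ to be the sample covariance matrix with divisor $N - 1$, and uses synthetic data with a hypothetical true relation $M_1 = 2M_2 - M_3$ and a known diagonal error matrix. The function name `weighted_regression` is introduced here for illustration.

```python
import numpy as np

def weighted_regression(X, V):
    """Roots of |A - lambda V| = 0 and the coefficients for the smallest root.

    X : (N, p) array of observations; V : (p, p) error covariance matrix.
    Returns (roots in ascending order, k) with k normalized so k' V k = 1.
    """
    x = X - X.mean(axis=0)                 # deviations from the means
    A = x.T @ x / (X.shape[0] - 1)         # sample covariance matrix [a_ij]
    L = np.linalg.cholesky(V)              # V = L L'
    Linv = np.linalg.inv(L)
    S = Linv @ A @ Linv.T                  # symmetric matrix with the same roots
    roots, vecs = np.linalg.eigh(S)        # eigenvalues in ascending order
    k = Linv.T @ vecs[:, 0]                # satisfies (A - lambda_1 V) k = 0
    return roots, k

# Synthetic illustration (all numbers hypothetical).
rng = np.random.default_rng(1)
N = 5000
M2, M3 = rng.normal(0, 3, N), rng.normal(0, 2, N)
M = np.column_stack([2 * M2 - M3, M2, M3])          # systematic parts
V = np.diag([0.5, 0.2, 0.3])                        # known error covariances
X = M + rng.multivariate_normal(np.zeros(3), V, size=N)

roots, k = weighted_regression(X, V)
coef = -k[1:] / k[0]          # rewrite the relation as m1 = c2 m2 + c3 m3
print("smallest root:", round(float(roots[0]), 3))
print("coefficients of m2, m3:", np.round(coef, 3))
```

When one exact linear relation holds between the systematic parts and $[V_{ij}]$ is the true error matrix, the smallest root is close to one in large samples, and the recovered coefficients are close to the hypothetical values 2 and -1.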

We can also use the determinantal equation (19) for a test for multicollinearity. 
We form the test functions: 

(20) $\Lambda_r = (N - 1) \sum_{i=1}^{r} \lambda_i$ 

where $\lambda_1$ is the smallest root of equation (19), $\lambda_2$ the next smallest, etc. 
Then we can use the theory of Hsu 19 in order to estimate the number of 
independent linear relationships between $M_{1t}, M_{2t}, \ldots, M_{pt}$ in the population 
which corresponds to our sample. It can be shown that under 
certain conditions $\Lambda_r$ is distributed like $\chi^2$ with $(N - 1 - p + r)r$ degrees 
of freedom. 
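
The test functions and their degrees of freedom are simple to tabulate once the roots are known. The following sketch assumes the reading $\Lambda_r = (N - 1)\sum_{i \le r} \lambda_i$ with $(N - 1 - p + r)r$ degrees of freedom; the function name and the three roots in the usage lines are hypothetical. Critical values would still have to be taken from a $\chi^2$ table.

```python
def multicollinearity_tests(roots, N):
    """Return (r, Lambda_r, degrees of freedom) for r = 1, ..., p - 1."""
    p = len(roots)
    ordered = sorted(roots)                  # smallest root first
    results, running = [], 0.0
    for r in range(1, p):
        running += ordered[r - 1]            # sum of the r smallest roots
        results.append((r, (N - 1) * running, (N - 1 - p + r) * r))
    return results

# Hypothetical roots for p = 3 variables and N = 21 observations.
results = multicollinearity_tests([0.9, 4.0, 25.0], N=21)
for r, stat, df in results:
    print(r, round(stat, 2), df)
```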

The test proposed is, however, based upon the assumption that the 

19 P. L. Hsu: “On the problem of rank and the limiting distribution of 
Fisher's test function,” Annals of Eugenics, vol. 11 (1941), pp. 39 ff. See also 
G. Tintner: “A note on rank, multicollinearity and multiple regressions,” 
Annals of Mathematical Statistics, vol. 16 (1945), pp. 304 ff. R. C. Geary: 
“Studies in the relations between economic time series,” Journal of the Royal 
Statistical Society, supplement, vol. 10 (1948), pp. 132 ff. T. W. Anderson: 
“The asymptotic distribution of the roots of certain determinantal equations,” 
ibid., pp. 190 ff. 






variances and covariances of the errors $V_{ij}$ are actually known. In 
practice they have to be estimated from the data by various methods. 
Hence the results can be considered only very rough approximations, as 
Bartlett has shown. 20 


The approximate $\chi^2$-distribution is not exactly valid. Another approximation 
has been given by Bartlett. 21 It can be shown that for large $N$ 
the two approximations agree. If the systematic parts have been approximated 
by orthogonal functions (section 8.1), the $\chi^2$-approximation holds, 
as Geary has demonstrated. 22 

Another large sample distribution has been established by T. W. 
Anderson. 23 He shows that in the limit, i.e., for large $N$, the quantity: 

(21) $\dfrac{\Lambda_r - Nr}{\sqrt{2Nr}}$ 

is normally distributed with mean zero and variance 1. 

The problem of multicollinearity is again a multiple choice problem 
(section 1.2). We want to estimate the number of linearly independent 
relations which probably exist between the systematic parts of our variables 
in the hypothetical infinite population from which our sample is 
taken. 

We proceed as follows: Let us form the test functions $\Lambda_1, \Lambda_2, \ldots$. 
We find that $\Lambda_R$ is not significant at a level of significance selected in 
advance but that $\Lambda_{R+1}$ is significant. Then we may conclude that there 
are probably $R$ linearly independent relations between the systematic parts 
of our variables in the population which corresponds to the sample in 
question. It should be emphasized that this test is only asymptotically 
valid for large samples. 

The test indicated above may also be used to test the number of relationships 
probably existing between the systematic components of any subset 
of our $p$ variables. Suppose that tests convince us that there is exactly 
one linear relationship existing in all probability between the systematic 
parts of the variables $X_{1t}, X_{2t}, \ldots, X_{p't}$. For the sake of convenience we 
write the relationship as follows: 

(22) $M_{1t} = k_0' + k_2' M_{2t} + \cdots + k_{p'}' M_{p't}$ 


20 M. S. Bartlett: “A note on the statistical estimation of supply and demand 
functions from time series,” Econometrica, vol. 16 (1948), pp. 323 ff. 

21 M. S. Bartlett: “Multivariate analysis,” Journal of the Royal Statistical 
Society, supplement, vol. 9 (1947), pp. 176 ff. 

22 R. C. Geary: “Studies in the relations between economic time series,” 
Journal of the Royal Statistical Society, vol. 10 (1948), supplement, pp. 132 ff. 

23 T. W. Anderson: op. cit. 





We form a determinantal equation similar to (19) for the $p'$ variables 
$X_1, X_2, \ldots, X_{p'}$. Let $\lambda_1'$ be the smallest root of the determinantal 
equation. The weighted regression coefficients $k_2', k_3', \ldots, k_{p'}'$ can be 
computed from the following system of linear equations: 

(23) 
$$\begin{aligned}
(a_{22} - \lambda_1' V_{22}) k_2' + (a_{23} - \lambda_1' V_{23}) k_3' + \cdots + (a_{2p'} - \lambda_1' V_{2p'}) k_{p'}' &= a_{12} - \lambda_1' V_{12} \\
(a_{23} - \lambda_1' V_{23}) k_2' + (a_{33} - \lambda_1' V_{33}) k_3' + \cdots + (a_{3p'} - \lambda_1' V_{3p'}) k_{p'}' &= a_{13} - \lambda_1' V_{13} \\
&\;\;\vdots \\
(a_{2p'} - \lambda_1' V_{2p'}) k_2' + (a_{3p'} - \lambda_1' V_{3p'}) k_3' + \cdots + (a_{p'p'} - \lambda_1' V_{p'p'}) k_{p'}' &= a_{1p'} - \lambda_1' V_{1p'}
\end{aligned}$$ 

The constant term $k_0'$ is found by a relationship analogous to (18). 

We can also apply a test of significance for the individual $k_i'$ given by 
Koopmans. 24 This test is again only approximate. 

Denote by $c_{ij}$ the element of the inverse of the matrix used to compute 
the $k_i'$ in (23). This inverse may be computed by the methods given by 
R. A. Fisher. 25 Computational methods are given in Appendix A.2. 
Then the square of the standard error of the coefficient $k_i'$, which we 
denote by $s_i^2$, is given approximately by: 


(24) 

where kf 



2 2 Kk; v ns 

>1 = 1 s — 1 



The ratio $k_i'/s_i$ is approximately distributed like Student’s $t$ with $N - p$ 
degrees of freedom. The $t$ computed in this fashion may also be used 
to establish fiducial or confidence limits for the weighted regression 
coefficients and to test hypotheses. The covariance of $k_i'$ and $k_j'$ may 
be computed by similar methods. 

In econometric problems it is frequently necessary to test the existence 
of linear relations between the weighted regression coefficients. 26 


24 T. C. Koopmans: Linear Regression Analysis in Economic Time Series 
(Haarlem, 1937), p. 80. 

25 R. A. Fisher: Statistical Methods for Research Workers (8th ed., New 
York, 1941), pp. 150 ff. 

26 G. Tintner: “A test for linear relations between weighted regression coefficients,” 
Journal of the Royal Statistical Society, series B, vol. 12 (1950), 
pp. 273 ff. 







Suppose we postulate now that apart from the normalizing condition 

(25) $g^*_{vv} = \sum_{i=1}^{p} \sum_{j=1}^{p} k_i^* k_j^* V_{ij} = 1$ 

the weighted regression coefficients $k_1^*, k_2^*, \ldots, k_p^*$ are also subject to 
a linear relationship: 

(26) $L^* = \sum_{i=1}^{p} L_i k_i^* = 0$ 

We denote the new coefficients by $k_i^*$ ($i = 1, 2, \ldots, p$) to distinguish 
them from the old ones. $L_1, L_2, \ldots, L_p$ are given constants. We have 
now to minimize a function $Q$ analogous to (4) under a condition $g^*_{vv}$ 
(25) analogous to (6) and also under the new condition (26). We introduce 
new Lagrange multipliers $\lambda^*$ and $\mu$ and form the function: 

(27) $H^* = F^* - \lambda^* g^*_{vv} + \mu L^*$ 

Differentiating (27) partially with respect to $k_i^*$ ($i = 1, 2, \ldots, p$) and 
setting the derivatives equal to zero, we have the system of linear equations, 
which includes condition (26): 

(28) 
$$\begin{aligned}
\sum_{j=1}^{p} (a_{ij} - \lambda^* V_{ij}) k_j^* + \mu L_i &= 0 \qquad (i = 1, 2, \ldots, p) \\
\sum_{j=1}^{p} L_j k_j^* &= 0
\end{aligned}$$ 

This is again a system of linear and homogeneous equations in the 
$p + 1$ unknowns: $k_1^*, k_2^*, \ldots, k_p^*, \mu$. In order to achieve a non-trivial 
solution the determinant of system (28) must be equal to zero: 

(29) 
$$\begin{vmatrix}
a_{11} - \lambda^* V_{11} & a_{12} - \lambda^* V_{12} & \cdots & a_{1p} - \lambda^* V_{1p} & L_1 \\
a_{21} - \lambda^* V_{21} & a_{22} - \lambda^* V_{22} & \cdots & a_{2p} - \lambda^* V_{2p} & L_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{p1} - \lambda^* V_{p1} & a_{p2} - \lambda^* V_{p2} & \cdots & a_{pp} - \lambda^* V_{pp} & L_p \\
L_1 & L_2 & \cdots & L_p & 0
\end{vmatrix} = 0$$ 

Multiplying (28) by $k_i^*$ and adding with respect to $i$, taking also 
$g^*_{vv} = 1$ (25) and $L^* = 0$ (26) into account, we see that: 

(30) $\lambda^* = \sum_{i=1}^{p} \sum_{j=1}^{p} a_{ij} k_i^* k_j^*$ 

Hence it follows that we have to choose the smallest root of equation 
(29). Substituting it into system (28) and taking a condition analogous 
to (6), namely $g^*_{vv} = 1$ (25) into account, we obtain properly normalized 






values of $k_1^*, k_2^*, \ldots, k_p^*$. These are the estimates of the linear relationships 
existing between the systematic parts of the variables in the 
population, achieved under the linear condition (26). The computation 
of the weighted regression coefficients is not necessary for the test which 
is described in what follows. 

For large samples, $\Lambda_1 = (N - 1)\lambda_1$ will be distributed approximately 
like $\chi^2$ with $N - p$ degrees of freedom. It should be recalled that $\lambda_1$ is 
the smallest root of equation (19). 

For large samples, $\Lambda_1^* = (N - 1)\lambda_1^*$ will be distributed approximately 
like $\chi^2$ with $N - p + 1$ degrees of freedom. $\lambda_1^*$ is the smallest root of 
equation (29). This follows by analogy to a theorem given by Wilks. 27 

Hence a test function can be constructed, which will give an answer 
to the following question: Is the sum of squares $\Lambda_1^*$ for the fit achieved 
under the linear condition (26) sufficiently larger than the sum of squares 
$\Lambda_1$ for the fit achieved without this condition, so that the difference could 
not have arisen by chance? 

There are two possible tests: (a) We form the difference: 

(31) $\Lambda_1^* - \Lambda_1$ 

This quantity is, under the null hypothesis, approximately distributed 
like $\chi^2$ with 1 degree of freedom. (b) We form the test function: 

(32) $F' = \dfrac{(\Lambda_1^* - \Lambda_1)(N - p)}{\Lambda_1}$ 

$F'$ is Snedecor’s $F$ (variance ratio) and is approximately distributed 
with 1 and $N - p$ degrees of freedom. Choosing an appropriate level 
of significance, say 5 per cent or 1 per cent, we can use it to test the null 
hypothesis that the difference between the sums of squares $\Lambda_1^*$ and $\Lambda_1$ 
has arisen by chance. The null hypothesis probably does not hold in the 
population corresponding to our sample if $F'$ is large. It seems that the 
second test is preferable if the variance-covariance matrix of the errors is 
not known but has been estimated from the data. 
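
Both test functions follow directly from the two smallest roots. The sketch below assumes the reading of (31) as $\Lambda_1^* - \Lambda_1$ and of (32) as $F' = (\Lambda_1^* - \Lambda_1)(N - p)/\Lambda_1$; the function name and the numerical roots in the usage line are hypothetical and serve only to show the arithmetic.

```python
def restriction_tests(lam1, lam1_star, N, p):
    """Test functions (31) and (32) for one linear restriction.

    lam1      : smallest root of (19), fit without the restriction
    lam1_star : smallest root of (29), fit under the restriction (26)
    """
    big1 = (N - 1) * lam1             # Lambda_1
    big1_star = (N - 1) * lam1_star   # Lambda_1*
    diff = big1_star - big1           # (31): approx. chi-square, 1 d.f.
    f_ratio = diff * (N - p) / big1   # (32): approx. F with 1 and N - p d.f.
    return diff, f_ratio

# Hypothetical roots for N = 22 observations on p = 4 variables.
diff, f_ratio = restriction_tests(lam1=0.8, lam1_star=1.1, N=22, p=4)
print(round(diff, 2), round(f_ratio, 2))
```

The two values would then be compared with tabulated critical values of $\chi^2$ with 1 degree of freedom and of $F$ with 1 and $N - p$ degrees of freedom.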

Example 1. The method of weighted regression was applied by Koopmans 28
to the world ship freight market for the period 1880-1911.

Let $X_1$ be the freight index (1900 = 100), $X_2$ transport in billions of
ton-miles, $X_3$ tonnage in millions of tons, $X_4$ coal prices in shillings per
ton. All these are expressed in percentages of the trend. The weighted

27 S. S. Wilks: Mathematical Statistics (Princeton, 1943), pp. 171 ff.

28 T. C. Koopmans: Linear Regression Analysis in Economic Time Series
(Haarlem, 1937), pp. 115 ff.





regression equation is $m_1 = 0.66\,m_2 + 0.29\,m_3 + 0.46\,m_4$. No effort is
made to identify this result with a meaningful economic relation.

Example 2. In order to compute these weighted regression coefficients
Koopmans had to assume somewhat arbitrarily a set of weights. We have
endeavored to estimate these weights by the variate difference method
(section 11.2) in a study of agricultural production in the United States,
1920-41. 29

Let $X_1$ be the logarithm of the volume of agricultural production, $X_2$
the logarithm of employment in agriculture, $X_3$ the logarithm of operating
capital, and $X_4$ time. The weighted regression equation appears then
as $m_1 = 2.7735\,m_2 + 0.9020\,m_3 + 0.0087\,m_4$. The first two coefficients
of the “Douglas-type" production function (section 3.4) are elasticities 
with respect to labor and operating capital, while the third coefficient 
represents an exponential trend. It appears, for instance, that an increase 
of agricultural employment by 1 per cent will result in an increase of 
agricultural production by about 2.8 per cent. An increase of 1 per cent 
in operating capital will bring about an increase of agricultural production 
by about 0.9 per cent. 

Example 3. We have also applied the method of weighted regression 
in an attempt to find a static demand and a static supply function for 
agricultural products in the United States. 30 This analysis was also 
discussed in section 3.2, Example 2. 

Denote by $X_1$ prices received by farmers for agricultural products, by
$X_2$ national income, by $X_3$ agricultural production, by $X_4$ time (origin
between 1931 and 1932), and by $X_5$ prices paid by farmers. The data
were given annually for the 24 years 1920-43.

An analysis of the data by weighted regression necessitates the estimation
of the error variances. These have again been established by the
variate difference method. This procedure is possible if we assume that
the systematic parts of our variables are “smooth” functions of time
(see section 11.2.1).

An investigation of the problem by means of the method explained above
shows that there are probably two linear relationships between the five
variables. Other tests show that there is probably one relationship
between the systematic parts of variables $X_1$, $X_2$, $X_3$, and $X_4$ and one
between the systematic parts of variables $X_1$, $X_3$, $X_4$, and $X_5$. The first
is evidently the demand function and the second the supply function.

29 G. Tintner: “An application of the variate difference method to multiple
regression,” Econometrica, vol. 12 (1944), pp. 97 ff.

30 G. Tintner: “Multiple regression for systems of equations,” Econometrica,
vol. 14 (1946), pp. 5 ff.





It should be noted that the inclusion of the systematic part of $X_2$ (national
income) in the first set and of the systematic part of $X_5$ (prices paid by
farmers) in the second set serves to make the relationships economically
meaningful. In this way we identify the first weighted regression equation
as the demand function and the second as the supply function.

Denoting deviations from the means of the systematic parts $M_i$ by
$m_i$, we have for our estimate of the demand function:

(33) $m_3 = -0.097\,m_1 + 0.424\,m_2 + 0.313\,m_4$


The estimate of the supply function is:

(34) $m_3 = \ldots\,m_1 + 0.809\,m_4 - 3.611\,m_5$

It appears from statistical tests that the results for equation (33) are
more reliable than for (34), as we should expect. Agricultural supply
depends largely on the weather and other factors not included in the
analysis. It is also possible that there are dynamic relationships. All
these have been neglected. We have also neglected errors in the equations
(Chapter 7). Methods for fitting dynamic relationships will be indicated
in sections 10.3.4 and 10.3.7.

We can compute elasticities from these equations which are based
upon the means of the variables over the period. The price elasticity
of demand estimated from the first equation is -0.123. That is to say,
other things being equal, an increase of 1 per cent in the prices of
agricultural products results in a decrease of about 0.1 per cent in the quantity
demanded. The income elasticity of demand is 0.309.

Fiducial or confidence limits can also be established for these elasticities.
They appear for the price elasticity of demand as -0.052 and
-0.195 at the 5 per cent level. A test of significance shows that it
is highly probable that the income elasticity of demand is greater
than the price elasticity. The importance of these tentative results for
economic policy is obvious (sections 1.1 and 1.2, and 3.2, Example 2).
These results or more reliable results of a similar nature ought to be
taken into account if the government taxes agricultural commodities,
fixes their prices, etc.

Example 4. K. S. Lomax 31 has used methods similar to the ones proposed
by us. For a period beginning in 1926 he obtained demand and supply
functions for cotton yarn in England. The price elasticity of the demand
for cotton yarn is estimated as 0.24 and the income elasticity as 0.88. The price







elasticity of the supply of cotton yarn is estimated as 2.66 and the cost 
elasticity of supply as —7.67. 

A demand function for rayon in England has also been investigated on 
the basis of data from the period 1926-38. The price elasticity of the 
demand for rayon is —0.97 and the income elasticity is 1.67. 

Lomax's data have also been used by M. S. Bartlett, 32 who applied a
modification of the method of weighted regression to estimate the demand
and the supply functions for cotton yarn. Denote by $x_1$ home consumption
of cotton yarn, by $x_2$ price level of cotton yarn, by $x_3$ cost of production
of cotton yarn, by $x_4$ income in real terms. All variables are logarithms.

The demand function for cotton yarn appears as:

(35) $m_1 = -0.310\,m_2 + 1.029\,m_4$.

The estimate of the price elasticity of the demand for cotton yarn is 
—0.301; the income elasticity is estimated as 1.029. The supply function 
for cotton yarn appears as: 

(36) m l = 5A19m 2 — 14.469m 3 

An investigation of the confidence regions for the weighted regression 
coefficients indicates that the demand function can be estimated with 
some accuracy, but not the supply function. This result is similar to the 
difficulties which arose in Example 3. 

Example 5. We propose to illustrate the method of weighted regression 
further by an example. We will endeavor to fit a static production function 
for the whole American economy in the period 1921-41, using yearly 
data for these N = 21 years. 33 This is mainly an effort to continue the 
work of Paul Douglas and his collaborators 34 (section 3.4). 

$X_1$ denotes the logarithm of labor in the United States, both industrial
and agricultural labor in millions of workers. $X_2$ is the logarithm of
total stock of fixed capital in the economy measured in billions of 1934
dollars. $X_3$ is the logarithm of total private net output also in billions
of 1934 dollars. $X_4$ is time measured from 1931 as origin. $X_1$ is taken
from statistical data published by the Department of Agriculture and


32 M. S. Bartlett: “A note on the statistical estimation of supply and demand 
relations from time series,” Econometrica , vol. 16 (1948), pp. 232. 

33 G. Tintner: “Some applications of multivariate analysis to economic data,”
Journal of the American Statistical Association, vol. 41 (1946), pp. 486 ff.

34 P. H. Douglas: Theory of Wages (New York, 1934). See also H. T. Davis:
Theory of Econometrics (Bloomington, Ind., 1941), pp. 153 ff. and the literature
cited on p. 159.





the Bureau of Labor Statistics. The other data have been taken with 
kind permission from an essay by L. R. Klein. 35 

The means of the data are given in Table 1.

TABLE 1

Variable             Symbol    Mean
log labor            $X_1$     1.651728
log fixed capital    $X_2$     2.051152
log production       $X_3$     1.768762

We assume here that the disturbances are independent. Let us denote
the error variance of the variable $X_i$ by $V_i$.

We have indicated above that various methods are available for estimating
the variances of the random elements $V_i$. We can utilize the
variate difference method (section 11.2) for this purpose if the following
conditions are fulfilled: 36 Each variable consists of the mathematical
expectation or systematic part $M_{it}$, which is a “smooth” function of time,
plus the random or error part. Then we can eliminate or at least
greatly reduce the systematic component by taking differences. If difference
series of a high enough order are computed we will eventually have
eliminated the systematic part entirely or at least sufficiently. Hence
this difference series and all higher difference series will consist of the
random part alone, or at least substantially of the random component.

In order to form an idea in which difference series this is the case we
compute by appropriate formulae the variances of the successive difference
series. These variances should be equal within the limits of probability
if there is no more systematic part and the series consists of the random
element alone.
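
The differencing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the series is invented, and the normalization (dividing the sum of squared k-th differences by (N - k) times the binomial coefficient C(2k, k)) is the usual one for the variate difference method rather than a formula quoted from section 11.2.

```python
from math import comb


def difference(series, order):
    """Return the finite difference of `series` taken `order` times."""
    values = list(series)
    for _ in range(order):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


def difference_variance(series, order):
    """Variance estimate of the random part from the `order`-th difference.

    Assumed normalization: the sum of squared differences is divided by
    (N - order) * C(2*order, order), because the `order`-th difference of a
    purely random series has variance C(2*order, order) times the error
    variance. `order` must be at least 1.
    """
    diffs = difference(series, order)
    return sum(d * d for d in diffs) / (len(diffs) * comb(2 * order, order))


# Hypothetical series: a smooth quadratic trend with no random part at all.
trend_only = [t * t for t in range(10)]
trend_variances = [difference_variance(trend_only, k) for k in (1, 2, 3)]
```

For a series that is a quadratic trend alone, the third difference vanishes, so the third entry of `trend_variances` is zero: the systematic part has been eliminated. With real data one would look for the order at which successive values stop falling and become roughly equal.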

We need estimates of the error variances of $X_1$, $X_2$, $X_3$. Table 2 gives

TABLE 2
Variances of Difference Series

Order of
Difference    log labor $X_1$    log capital $X_2$    log product $X_3$
1             0.00026497         0.00001897           0.00082478
2             0.00012103         0.00000250           0.00026453
3             0.00010135         0.00000112           0.00020488
4             0.00009259         0.00000080           0.00018496
5             0.00007757         0.00000065           0.0001642?


35 L. R. Klein: … (New York, …).

36 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940).





the variances of various difference series of the variables. Tests (sections
11.2.2 and 11.2.3) indicate that the second difference of $X_1$, the third
difference of $X_2$, and the second difference of $X_3$ give under our assumptions
reasonably accurate estimates of the true error variances of the
variables in question. The error variances are for the reader's convenience
represented in Table 3.

TABLE 3
Error Variances

Variable       Symbol    Error Variance
log labor      $X_1$     0.00012103
log capital    $X_2$     0.00000112
log product    $X_3$     0.00026453


These error variances are assumed to be estimated with enough accuracy 
so that we can treat them as constants. This assumption is probably not 
justified in our case. The error covariances are assumed to be zero. 

The resulting weighted regression equation will represent the production 
function only if it is identified in a larger system of equations. This 
could be accomplished by methods similar to the ones indicated in 
Example 3. See also section 7.1. 

We want to fit the weighted regression function: 

(37) $k_1 m_1 + k_2 m_2 + k_3 m_3 + k_4 m_4 = 0$

taking into account the fundamental assumptions of our method. The 
linear equations for the coefficients are derived from the variance- 
covariance matrix of the variables (Table 4). The linear equations for 


TABLE 4
Variance-Covariance Matrix

         $X_1$         $X_2$         $X_3$         $X_4$
$X_1$    0.00093135    0.00000045    0.00187085    0.00515500
$X_2$                  0.00030905    0.00045005    0.06067500
$X_3$                                0.00562850    0.25020000
$X_4$                                              38.50000000


the weighted regression coefficients $k_1$, $k_2$, $k_3$, and $k_4$ become, according
to formula (17):

(38)
$0.00093135\,k_1 + 0.00000045\,k_2 + 0.00187085\,k_3 + 0.00515500\,k_4 = 0.00012103\,\lambda k_1$
$0.00000045\,k_1 + 0.00030905\,k_2 + 0.00045005\,k_3 + 0.06067500\,k_4 = 0.00000112\,\lambda k_2$
$0.00187085\,k_1 + 0.00045005\,k_2 + 0.00562850\,k_3 + 0.25020000\,k_4 = 0.00026453\,\lambda k_3$
$0.00515500\,k_1 + 0.06067500\,k_2 + 0.25020000\,k_3 + 38.50000000\,k_4 = 0$


The last equation has no error terms since evidently the variable $X_4$ (time)
is free of error.

It is apparent that this system (38) of homogeneous linear equations 
can have a non-trivial solution only if its determinant (19) becomes zero: 



(39)
$$
\begin{vmatrix}
0.00093135 - 0.00012103\lambda & 0.00000045 & 0.00187085 & 0.00515500 \\
0.00000045 & 0.00030905 - 0.00000112\lambda & 0.00045005 & 0.06067500 \\
0.00187085 & 0.00045005 & 0.00562850 - 0.00026453\lambda & 0.25020000 \\
0.00515500 & 0.06067500 & 0.25020000 & 38.50000000
\end{vmatrix} = 0
$$

The two smallest roots of this equation are $\lambda_1 = 0.5482$ and $\lambda_2 = 22.304$.

These can be used to form test functions (20): $\Lambda_1 = 20(0.5482) = 10.9640$.
For large samples this is distributed like $\chi^2$ with 17 degrees of freedom.
The $\chi^2$ permitted at the 5 per cent level is 27.587, and at the 1 per cent
level is 33.409. Hence $\Lambda_1$ is not significant.

Next we compute $\Lambda_2 = 20(0.5482 + 22.304) = 457.044$. This is again
distributed like $\chi^2$ with 36 degrees of freedom. For the 5 per cent level
of significance we get a permissible $\chi^2$ of 50.714, and for the 1 per cent
level 57.804. Our empirical $\Lambda_2$ is significant. Hence we conclude that
it is unlikely that there is more than one linear relationship between the
four variables $M_1$, $M_2$, $M_3$, and $M_4$. We do not have multicollinearity.

We get similar results if we use the normal approximation of T. W.
Anderson, which has been indicated above [formula (21)]. For $\Lambda_1$ we
have the normal deviate: $(10.964 - 21)/\sqrt{(2)(21)} = -1.55$. The probability
of this deviate is about 0.11, and hence it is not significant at the
1 per cent level of significance.

We may similarly test $\Lambda_2$. Here we have the normal deviate:
$[457.044 - (2)(21)]/\sqrt{(4)(21)} = 45.3$. This is certainly significant at the
1 per cent level of significance. Hence we conclude that we have probably





one single linear relationship between the systematic parts of our variables 
in the population which corresponds to our sample. 
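
The two normal deviates can be checked numerically. In this sketch the centering and scaling constants are simply read off the two worked lines in the text; they are not derived from formula (21), so the helper below is an assumption about how the numbers were combined.

```python
from math import sqrt


def normal_deviate(statistic, center, variance):
    """Standardize a test statistic: (statistic - center) / sqrt(variance)."""
    return (statistic - center) / sqrt(variance)


# Constants copied from the worked lines in the text.
z_1 = normal_deviate(10.964, 21, 2 * 21)
z_2 = normal_deviate(457.044, 2 * 21, 4 * 21)
```

The first deviate is small in absolute value and the second is very large, in agreement with the chi-square comparison.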

Inserting the smallest root $\lambda_1 = 0.5482$ into the determinantal equation
(39), we obtain a matrix for the computation of the regression coefficients
of our weighted regression:

(40) $m_3 = k_1' m_1 + k_2' m_2 + k_4' m_4$

The equations (23) to be solved are in our case:

(41)
$0.00086500\,k_1' + 0.00000045\,k_2' + 0.00515500\,k_4' = 0.00187085$
$0.00000045\,k_1' + 0.00030844\,k_2' + 0.06067500\,k_4' = 0.00045005$
$0.00515500\,k_1' + 0.06067500\,k_2' + 38.50000000\,k_4' = 0.25020000$

The solution is given in the following weighted regression equation:

(42) $m_3 = 2.128806\,m_1 + 0.338665\,m_2 + 0.005680\,m_4$

This is our estimate of the production function as it presumably exists in
the hypothetically infinite population from which our sample is taken.
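
As a check on the arithmetic, the three equations (41) can be solved directly. The sketch below uses Cramer's rule in plain Python; the helper names are illustrative, and the coefficients are entered exactly as printed, so small rounding differences from (42) are to be expected.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    base = det3(matrix)
    solution = []
    for col in range(3):
        # Replace one column of the matrix by the right-hand side.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:]
                    for r, row in enumerate(matrix)]
        solution.append(det3(replaced) / base)
    return solution


# Coefficients of equations (41), unknowns ordered k1', k2', k4'.
A = [[0.00086500, 0.00000045, 0.00515500],
     [0.00000045, 0.00030844, 0.06067500],
     [0.00515500, 0.06067500, 38.50000000]]
b = [0.00187085, 0.00045005, 0.25020000]

k1, k2, k4 = solve3(A, b)
```

The solution agrees with the coefficients of (42) to about four decimal places, the remaining difference being due to the rounding of the printed entries.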

This production function gives estimates of the elasticities of production
with respect to labor and fixed capital and also a time trend as they
presumably exist in a hypothetical infinite population which corresponds
to our sample. For instance, other things being equal, an increase of
1 per cent in the total fixed capital will increase the total product by more
than 1/3 per cent. An increase in the total labor force by 1 per cent will
increase the product by more than 2 per cent. The last term represents
an exponential trend. It has to be interpreted in this way: Production
increased about 3 per cent each year during the period. This last estimate
agrees reasonably well with earlier estimates of Carl Snyder 37 and others.

The weighted regression equation (42) given above should be compared 
with the regression equation which has been derived by classical least 
squares methods (section 5.1): 

(43) $x_3 = 1.976985\,x_1 + 0.332328\,x_2 + 0.005710\,x_4$

This, however, is designed to predict most successfully $x_3$ if $x_1$, $x_2$, and
$x_4$ are given. It does not yield the structural relation.

Using the geometric means whose logarithms appear in Table 1, we 
can also compute the marginal productivities of capital and labor. From 
the weighted regression coefficients we get, for the marginal productivity 
per worker, $2279.63, and for the marginal productivity of the stock of 
fixed capital per dollar, $0.292.


37 C. Snyder: Capitalism the Creator (New York, 1940). 





That is to say: Assume that conditions are on the whole the same as 
in the period considered. Other things being equal, the addition of one 
worker will result in an increase in the national product by about $2000. 
The addition of one more dollar to the stock of fixed capital will bring 
about, ceteris paribus , an increase in the net national product of almost 
$0.30. Both these estimates appear somewhat high, but maybe not 
excessively so in the light of some previous investigations in the field of 
agricultural production functions 38 (section 3.4, Example 3). These 
earlier results are, however, not strictly comparable with our production 
function which has been derived for the whole economy. They have 
been derived by different statistical procedures. 

All these results should be interpreted in the light of their statistical
variability as described by their approximate standard errors. The matrix
inverse to the one used in computing our weighted regression equation
(41) is given in Table 5.

TABLE 5
Inverse Matrix

         $X_1$       $X_2$       $X_4$
$X_1$    1157.363    41.734      -0.221
$X_2$                4700.355    -7.413
$X_4$                            0.038

Using these data and the previous results, we compute the approximate
standard errors of the weighted regression coefficients from formula (24).
The standard error of the coefficient of $m_1$ in the weighted regression
equation (2.128806) turns out to be 0.174; that of the coefficient of $m_2$
(0.338665) appears as 0.351; and that of the coefficient of $m_4$ (0.005680)
is 0.0009994.

Using the t-test, we see that the corresponding values of t are 12.235,
0.965, and 5.683. The t required for 17 degrees of freedom at the 5 per
cent level of significance is 2.110 and at the 1 per cent level is 2.898. It
turns out that the coefficients of $m_1$ and $m_4$ are highly significant, but not
the coefficient of $m_2$.
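
The t values are simply the coefficients divided by their approximate standard errors. A minimal Python sketch, using only numbers quoted in the text (the dictionary keys are illustrative labels):

```python
coefficients = {"m1": 2.128806, "m2": 0.338665, "m4": 0.005680}
std_errors = {"m1": 0.174, "m2": 0.351, "m4": 0.0009994}

# Critical t for 17 degrees of freedom at the 1 per cent level, as quoted.
T_1PCT = 2.898

t_values = {name: coefficients[name] / std_errors[name]
            for name in coefficients}
highly_significant = {name: abs(t) > T_1PCT for name, t in t_values.items()}
```

Only the entry for `m2` in `highly_significant` is false, matching the conclusion drawn for the capital coefficient.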

Hence it would appear that we can perhaps with some accuracy determine
the elasticity of production with respect to labor, but not with respect
to the stock of fixed business capital. A possible explanation is that the
effects of an increase in fixed capital may not appear in the same year
but in subsequent years. This possible dynamic relationship is here

38 G. Tintner and O. H. Brownlee: “Production functions derived from farm
records,” Journal of Farm Economics, vol. 26 (1944), pp. 566 ff.





neglected. Methods for fitting dynamic relationships will be presented 
in section 10.3.3. The time trend can also be determined with reasonable 
accuracy. All these results are of only an approximate nature. 

We want to give fiducial or confidence limits for our estimate of the
elasticity of production with respect to labor. Using a confidence
coefficient of 99 per cent, we get for the approximate limits of the elasticity
2.633 and 1.625. This has to be interpreted in the following way: An
increase in the total labor force by 1 per cent will increase the product
probably by not more than about 2.6 per cent and not less than about 1.6
per cent. These are pretty wide limits and emphasize the tentative nature
of our conclusions.
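
These limits follow from the estimate, its approximate standard error, and the tabulated t for 17 degrees of freedom. A short sketch, assuming the symmetric form "estimate plus or minus t times standard error" (the function name is illustrative):

```python
def confidence_limits(estimate, std_error, t_critical):
    """Symmetric limits: estimate -/+ t_critical * std_error."""
    half_width = t_critical * std_error
    return estimate - half_width, estimate + half_width


# Labor elasticity, its standard error, and t at the 1 per cent level
# for 17 degrees of freedom, all as quoted in the text.
lower, upper = confidence_limits(2.128806, 0.174, 2.898)
```

The result reproduces the limits 1.625 and 2.633 to three decimal places.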

The same type of analysis can also be applied to the marginal productivity
of labor. Using a confidence coefficient of 99 per cent, we get
for these approximate limits 2819.54 and 1740.13. Ceteris paribus, under
conditions approximately the same as the ones prevailing in the period
considered, we can make this statement: An increase of the labor force
by one worker will probably result in an increase of the total national
product by not more than about $2800 and not less than about $1700.
The latter figure is probably nearer to the true value.

We want to stress that the results for the static production function 
of the whole United States should not be taken too seriously. Our 
data are perhaps not quite adequate for the determination of such a 
function. The economic meaning of a production function representing 
all enterprises is also somewhat doubtful. It would be more desirable 
to try to fit production functions of the Douglas type to specific industries 
(see section 3.4, Examples 2 and 3). We believe, however, that the 
methods indicated above should be tried in the statistical analysis of 
such a problem. 

Using better data and also more refined statistical procedures, we may 
eventually estimate a production function which is more reliable. Knowledge
of such a production function could be of great importance for
economic planning. It should be taken into account if the government 
contemplates changes in the total amount of labor employed (e.g., through 
a draft in wartime), if it envisages change in the total amount of capital 
(e.g., through taxation), and if it seeks other goals of economic policy. 

The above regression equation (42) can also be written: 

(44) $74.660892\,M_1 + 11.877565\,M_2 - 35.071722\,M_3 + 0.199207\,M_4 - 85.648648 = 0$

The weighted regression coefficients are here normalized by the condition 
g uv = 1 [formula (6)]. 
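
Equation (44) can be checked against (42) and Table 1. The sketch below assumes that the fitted relation passes through the means of the variables and that the mean of time is zero (time being measured from 1931, the middle year of 1921-41); both are assumptions of this illustration, not statements from the text.

```python
# Normalized coefficients of equation (44).
k = {"M1": 74.660892, "M2": 11.877565, "M3": -35.071722, "M4": 0.199207}

# Means from Table 1; the zero mean of time is an assumption.
means = {"M1": 1.651728, "M2": 2.051152, "M3": 1.768762, "M4": 0.0}

# If the relation holds at the means, the constant equals sum(k_i * mean_i).
constant = sum(k[name] * means[name] for name in k)

# Dividing through by -k_3 should give back the coefficients of (42).
slopes = {name: -k[name] / k["M3"] for name in ("M1", "M2", "M4")}
```

Under these assumptions `constant` reproduces the 85.648648 of (44), and `slopes` returns the three coefficients of (42).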





We know from economic theory 39 that we have constant returns to 
scale if the product is a homogeneous function of the first degree in labor 
and capital (section 3.4). In our terminology this means that in the 
population corresponding to our sample we must have: 

(45) $-(k_1/k_3) - (k_2/k_3) = 1$

The left-hand side of (45) is actually 2.467471 in our equation (44).
Equation (45) implies a linear relationship between the weighted regression
coefficients:

(46) $k_1^* + k_2^* + k_3^* = 0$
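
As a worked check, taking the coefficients of equation (44) as printed, the left-hand side of (45) and the step from (45) to (46) can be written out:

$$
-\frac{k_1}{k_3} - \frac{k_2}{k_3} = \frac{74.660892}{35.071722} + \frac{11.877565}{35.071722} = 2.128806 + 0.338665 = 2.467471
$$

$$
-\frac{k_1}{k_3} - \frac{k_2}{k_3} = 1 \;\Longrightarrow\; k_1 + k_2 = -k_3 \;\Longrightarrow\; k_1 + k_2 + k_3 = 0
$$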

We designate our new regression coefficients by $k^*$. This is relationship
(26) which is now imposed. We have $L_1 = L_2 = L_3 = 1$, $L_4 = 0$.
The determinantal equation (29) is in our case:

(47)
$$
\begin{vmatrix}
0.00093135 - 0.00012103\lambda^* & 0.00000045 & 0.00187085 & 0.00515500 & 1 \\
0.00000045 & 0.00030905 - 0.00000112\lambda^* & 0.00045005 & 0.06067500 & 1 \\
0.00187085 & 0.00045005 & 0.00562850 - 0.00026453\lambda^* & 0.25020000 & 1 \\
0.00515500 & 0.06067500 & 0.25020000 & 38.50000000 & 0 \\
1 & 1 & 1 & 0 & 0
\end{vmatrix} = 0
$$

The smallest root of equation (47) is $\lambda_1^* = 0.9692$. The sum of squares
computed from this root is $\Lambda_1^* = 19.3840$.

If we use the $\chi^2$-test for the equality of the two sums of squares $\Lambda_1^*$
and $\Lambda_1$, we see that their difference is 8.420. This is in the limit
distributed like $\chi^2$ with 1 degree of freedom [formula (31)]. But at the 5 per
cent level of significance the permissible value of $\chi^2$ is only 3.841. Hence
the hypothesis of equality has to be rejected.


39 G. Stigler: Production and Distribution Theories (New York, 1941), pp.
320 ff. G. Tintner: “A test for linear relations between weighted regression
coefficients,” Journal of the Royal Statistical Society, series B, vol. 12 (1950),
pp. 273 ff.









We test the null hypothesis that the difference between the sums of
squares $\Lambda_1$ and $\Lambda_1^*$ has arisen by chance also by the variance ratio test.
The test is based upon the 5 per cent level of significance. The analysis
of variance is summarized in Table 6.

TABLE 6
Analysis of Variance

Source of Variation          Degrees of Freedom    Sums of Squares    Mean Square
$\Lambda_1$                  17                    10.9640            0.6449
$\Lambda_1^* - \Lambda_1$    1                     8.420              8.4200

The variance ratio is $F' = 13.056$ [formula (32)]. This ratio has
approximately the F-distribution with 1 and 17 degrees of freedom. At
the 5 per cent level of significance F may be as large as 4.45. Our empirical
$F'$ is much larger. Hence the null hypothesis has to be rejected.
It is not likely that relation (45) holds in the population corresponding
to the sample.
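
Both tests can be retraced from the two quoted roots. The following Python sketch assumes N = 21 and p = 4 as in this example and takes the roots and critical values from the text; the variable names are illustrative.

```python
N, p = 21, 4
root_unrestricted = 0.5482    # smallest root of (39)
root_restricted = 0.9692      # smallest root of (47)

sum_sq = (N - 1) * root_unrestricted         # sum of squares without the condition
sum_sq_star = (N - 1) * root_restricted      # sum of squares under the condition

difference = sum_sq_star - sum_sq            # test (a), formula (31)
mean_square = sum_sq / (N - p)               # mean square in Table 6
variance_ratio = difference / mean_square    # test (b), formula (32)

# 5 per cent critical values as quoted in the text.
CHI2_1DF_5PCT = 3.841
F_1_17_5PCT = 4.45

reject_by_chi2 = difference > CHI2_1DF_5PCT
reject_by_f = variance_ratio > F_1_17_5PCT
```

Both flags come out true, so either test rejects the hypothesis that relation (45) holds.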

We conclude from this that in all probability there are not constant 
returns to scale, but increasing returns to scale in the American economy 
during the period considered. This is in contradiction to earlier results 
established by Paul Douglas and his collaborators for the American 
economy. 40 It contradicts also a previous result of the author regarding 
a production function for Iowa farms 41 (section 5.3, Example 1). The 
variance ratio was significant at the 5 per cent level but not at the 1 per 
cent level. 

A tentative explanation can perhaps be given in the following way: 
The studies of Douglas and his collaborators refer to the manufacturing 
section of the American economy only. Our own study refers to a 
production function in agriculture. It is conceivable that the production 
functions for industry and agriculture separately are homogeneous 


40 C. W. Cobb and P. H. Douglas: “A theory of production,” American
Economic Review, vol. 18 (1928), supplement, pp. 139 ff. P. H. Douglas and
M. Bronfenbrenner: “Cross section studies in the Cobb-Douglas function,”
Journal of Political Economy, vol. 47 (1939), pp. 761 ff. P. H. Douglas and
G. T. Gunn: “The production function for American manufacturing for 1914,”
ibid., vol. 50 (1942), pp. 595 ff. P. H. Douglas, P. Daly, and E. Olson: “The
production function for manufacturing in the United States, 1904,” ibid., vol.
51 (1943), pp. 61 ff. See also J. Marschak and W. H. Andrews: op. cit., pp.
143 ff.

41 G. Tintner: “A note on the derivation of production functions from farm 
records,” Econometrica , vol. 12 (1944), pp. 2^ ff. 





functions of degree 1, but not a production function for the total American 
economy. 

Example 6. Another example is the fitting of a static demand and 
supply function for labor in England. 42 

Our very simplified model of the industrial labor market consists of 
a demand function and a supply function. Since we will deal with the 
inter-war period of the English economy (1920-38) we do not need to 
assume equilibrium on the labor market. 

The equations indicated have been derived from the assumptions of 
free competition. If we have various forms of imperfect competition, 
then the elasticities of demand and supply will also enter into these 
equations in a simple manner. 43 

Our main interest is the problem of homogeneity. The author has
shown 44 that under a great variety of assumptions regarding market
organizations the demand and supply functions of the factors of production
(among them labor) will be homogeneous of zero degree in the prices.
They depend upon the price ratios rather than upon the absolute prices.

In the article quoted it is shown that the homogeneity property is
preserved under the following types of market organization: free competition,
monopoly, monopsony, monopolistic and monopsonistic competition,
selling cost, price discrimination, product differentiation, spatial
competition. It is only certain forms of oligopoly and oligopsony and
certain cases of bilateral monopoly which are excluded.

A number of articles published recently 45 stress the problem of 

42 The author is much obliged to Professors J. Tinbergen, L. R. Klein, P. M.
Samuelson, and R. Stone for advice and criticism. I am also obliged to Mrs.
N. Smithies for help with the computations and to Mr. A. A. Adams for help
with the collection of the data.

43 J. Robinson: The Economics of Imperfect Competition (London, 1938).

44 G. Tintner: “Homogeneous systems in mathematical economics,”
Econometrica, vol. 16 (1948), pp. 273 ff.

45 J. M. Keynes: The General Theory of Employment, Interest and Money
(London, 1936). L. R. Klein: The Keynesian Revolution (New York, 1947),
pp. 194 ff. J. R. Hicks: Value and Capital (Oxford, 1939), pp. 254 ff. O.
Lange: Price Flexibility and Employment (Bloomington, Ind., 1944), pp. 99 ff.
J. L. Mosak: General Equilibrium Analysis in International Trade (Bloomington,
Ind., 1944), pp. 99 ff. W. W. Leontief: “The fundamental assumption of Mr.
Keynes' monetary theory of unemployment,” Quarterly Journal of Economics,
vol. 51 (1936), pp. 192 ff.; “Postulates: Keynes' general theory and the
classicists,” in S. E. Harris, ed.: The New Economics (New York, 1947), pp. 232 ff.
D. Patinkin: “Relative prices, Say's law and the demand for money,”
Econometrica, vol. 16 (1948), pp. 135 ff.; “The indeterminacy of absolute prices in





homogeneity. If the demand and supply functions of all commodities are
homogeneous of zero degree, then they depend upon the price ratios
rather than upon the absolute values of the prices. Since Keynes it has
been stressed by various theoretical economists that in the case of
homogeneity general market equilibrium is possible without the full employment
of all factors of production.

One suggestion which has been made by Keynes and many post- 
Keynesian writers 46 is as follows: If at least one of the demand or supply 
functions is not homogeneous in the prices, then this difficulty can be 
avoided. It has more particularly been suggested that the static supply 
function of labor may not be homogeneous. 

For various reasons the supply of labor may depend upon money wages
rather than upon real wages. This phenomenon may be due to many
causes: money illusion, 47 past contracts, 48 non-rational behavior, etc.

The relation between real and money wages has been discussed extensively
during recent years. 49 But it is interesting to note that the question
of homogeneity has not received attention in connection with these studies.
In fact it is not easy to see the relevance of these empirical studies for


classical economic theory,” ibid., vol. 17 (1949), pp. 1 ff. G. Tintner:
“Homogeneous systems in mathematical economics,” ibid., vol. 16 (1948), pp. 273 ff.
W. B. Hickman: “The determinacy of absolute prices in classical economics,”
ibid., vol. 18 (1950), pp. 9 ff. W. W. Leontief: “The consistency of the classical
theory of money and prices,” ibid., pp. 21 ff. C. G. Phipps: “A note on
Patinkin's ‘relative prices,’” ibid., pp. 25 ff.

46 F. Modigliani: “Liquidity preference and the theory of money,”
Econometrica, vol. 12 (1944), pp. 45 ff. D. M. Fort: “A theory of general short run
equilibrium,” ibid., vol. 13 (1945), pp. 293 ff.

47 J. Marschak: “Money illusion and demand analysis,” Review of Economic
Statistics, vol. 25 (1943), pp. 40 ff.; “The rationale of the demand for money
and of money illusion,” Metroeconomica, vol. 2 (1950), pp. 71 ff.

48 J. R. Hicks: op. cit., pp. 264 ff.

49 J. M. Keynes: op. cit., pp. 5 ff., pp. 257 ff. A. C. Pigou: “Real and
money wage rates in relation to unemployment,” Economic Journal, vol. 47
(1937), pp. 405 ff. L. Tarshis: “Real wages in the United States and Great
Britain,” Canadian Journal of Economics and Political Science, vol. 4 (1938),
pp. 362 ff.; “Changes in real and money wages,” Economic Journal, vol. 49
(1939), pp. 150 ff. J. T. Dunlop: “The movement of real and money wages,”
ibid., vol. 48 (1938), pp. 413 ff. J. M. Keynes: “Relative movements of real
wages and output,” ibid., vol. 49 (1939), pp. 34 ff. W. Fellner: Monetary
Policies and Full Employment (Berkeley, 1946), pp. 94 ff., 109 ff. B. F. Haley:
“Value and distribution,” in H. S. Ellis, ed.: A Survey of Contemporary
Economics (Philadelphia, 1948), pp. 36 ff.





the theoretical problem raised above. The reason for this is probably
the extreme skepticism of Keynes towards econometric investigations, 50
which was also shared to a certain extent by the other participants in the
discussion.

We propose to attack the problem of homogeneity of the demand and 
supply function of labor from an econometric point of view. We want 
to indicate at least a partial and tentative answer to the questions: Are 
the demand and supply functions of industrial labor in England homo¬ 
geneous of zero degree? Do they depend upon real wages rather than 
upon money wages? 
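
Stated for a function that is linear in the logarithms, the two questions reduce to a single restriction on the elasticities. The following is only a sketch: the symbols $q$ (quantity of labor), $p$ (price level), $w$ (money wage), $\alpha$, $\beta_p$, $\beta_w$ and $\lambda$ are introduced here for illustration and are not the author's notation.

$$
\log q = \alpha + \beta_p \log p + \beta_w \log w .
$$

Multiplying $p$ and $w$ by the same factor $\lambda > 0$ gives

$$
\log q' = \alpha + \beta_p (\log p + \log \lambda) + \beta_w (\log w + \log \lambda)
        = \log q + (\beta_p + \beta_w) \log \lambda .
$$

Hence $q' = q$ for every $\lambda$ (homogeneity of zero degree) exactly when $\beta_p + \beta_w = 0$, and in that case

$$
\log q = \alpha + \beta_w \log (w / p),
$$

so that the function depends on the real wage $w/p$ alone.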

The following series have been used in carrying out the statistical work 
which will be described below. 


A. The Ministry of Labour Cost-of-Living Index (Base July 1914 = 100)

This index is a weighted average of retail prices, the composition and 

weighting being based on an average working-class family budget for 

1914 and the prices being obtained mostly by monthly reports from a 

large number of retailers. The figures used are taken from the Statistical 

Abstract of the United Kingdom. In what follows the index is indicated 
by K. 


B. The Board of Trade Wholesale Price Index

This is a geometric mean of wholesale price series, mostly market
quotations. The number of series used for each commodity group is
proportional to the value of the commodities manufactured, as shown
by the Censuses of Production, and the value imported for direct
consumption. The index was extensively revised in 1934. The earlier index
was based on the 1907 census and used about 180 quotations covering
150 commodities; the revision was based on the 1930 census and 260
quotations were used, covering 200 commodities. Raw materials and
semi-manufactured and manufactured goods are represented.


Weighting Scheme

                        1920-34 Index    1930-38 Index
Food                         53               68
Coal                         10                9
Metals                       32               45
Textiles                     31               30
Chemicals and oils            9               15
Miscellaneous                15               33
Total                       150              200


50 J. M. Keynes: "Professor Tinbergen's method," Economic Journal, vol. 49 (1939).





For this investigation the two series have been grafted by using the 
years of overlap 1930-34. The value in 1930 has been taken as 100, and 
the source is the Statistical Abstract of the United Kingdom. This series 
is indicated by P. 
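
The text does not say how the two indexes were linked over 1930-34. The sketch below shows one common way to graft two index series, under the assumption (not taken from the source) that the link factor is the ratio of the two series' sums over the overlap years. The function name `graft_index` and all numbers are hypothetical and serve only as an illustration.

```python
def graft_index(old, new, overlap_years, base_year):
    """Link an older index onto a newer one and rebase the result to 100.

    old, new: dicts mapping year -> index value, each on its own base.
    Assumption (not stated in the text): the link factor is the ratio of
    the two series' sums over the overlap years.
    """
    factor = sum(new[y] for y in overlap_years) / sum(old[y] for y in overlap_years)
    # Years covered only by the old index are rescaled; the new index is kept as is.
    linked = {y: v * factor for y, v in old.items() if y not in new}
    linked.update(new)
    base = linked[base_year]
    return {y: 100.0 * linked[y] / base for y in sorted(linked)}


# Made-up numbers, for illustration only (not the Board of Trade figures).
old_index = {1928: 140.0, 1929: 135.0, 1930: 120.0, 1931: 105.0}
new_index = {1930: 100.0, 1931: 88.0, 1932: 86.0}
spliced = graft_index(old_index, new_index, overlap_years=[1930, 1931], base_year=1930)
```

Using the whole overlap rather than a single link year makes the factor less sensitive to one unusual year; a single-year link would be an equally defensible choice.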

C. Wage Index 

The wage index is a weighted average of certain wage data for June 
each year. The wages in the following industries have been included: 
coal mining; other mining and quarrying; iron and steel trades; engin¬ 
eering; shipbuilding and metal trades; non-ferrous trades; textile trades; 
leather trades; clothing trades; food, drink, and tobacco trades; chemical 
trades; paper, printing, and stationery trades; timber trades; clay and 
building material trades; miscellaneous manufacturing; and building 
and contracting. 

The weights are approximately proportional to the aggregate weekly 
full-time wages in each industry in 1924. The latter have been estimated 
as far as possible from the available data regarding numbers employed 
and wage earnings. The base year of the index is the average of June 
and December 1924 = 100. The source is the Journal of the Royal 
Statistical Society, 1935, Part 4, p. 639; 1938, Part 1, p. 202; 1938, Part 2, 
p. 289. Unpublished figures have also been used. This series is indicated 
by W. 

D. Number of Employed 

These figures have been taken from a forthcoming publication by A. L.
Chapman entitled Wages and Salaries in the United Kingdom, 1920-1938.
These figures are then expressed as a ratio to the gross labor force.

The gross labor force was calculated from figures for the total population
of the United Kingdom, analyzed by age groups and sex, given in
the forthcoming Consumers' Expenditure in the United Kingdom, 1920-1938,
by J. R. N. Stone. From the total population the following deductions
have been made: To allow for infants and children at school, all
children in age groups up to 9 and four-fifths of the age group 10-14 
were deducted; less than four-fifths of the 10-14 group was deducted in 
1920-22, to allow for the gradual raising of the school-leaving age. To 
allow for people too old to work, fractions of age groups 55-64 (women) 
and 65 and over (men and women) were deducted; the fraction of each 
age group to be deducted in each year was estimated from the proportions 
of those age groups occupied and unoccupied according to the Censuses 
of Population, with a linear trend between the censuses. 

The coverage is approximately the same as for the above series W.
This series is denoted by D. 





E. The Sum of Employed and Unemployed 

The figures for the unemployed are averages of 12 months’ series of 
insured unemployed. This series is added to the number of employed 
and again divided by the gross labor force. The ratio is denoted by E. 

We present in Table 7 the means of the logarithms of the various series. 
Time is measured with 1929 as origin. 


TABLE 7

Means

log K    2.219
log W    2.011
log E    2.405
log P    2.063
log D    2.333


The method of weighted regression has been used in order to estimate 
the elasticities. The error variances are estimated from the third differ¬ 
ences of the data. We have also investigated the variables which enter 
into each equation for multicollinearity. Equations with time trend and 
without time trend are given, also least squares approximations. The 

problem of homogeneity is investigated by the methods given above. 
The results are presented in Tables 8, 9, and 10. 
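
The remark that error variances are estimated from third differences can be illustrated as follows. This is a minimal sketch, assuming that the systematic part of a series is smooth enough to be removed by third differencing and that the errors are independent with constant variance; under these assumptions the variance of a third difference of the errors is C(6, 3) = 20 times the error variance. The function names are hypothetical, and the sketch is not claimed to reproduce the author's computations.

```python
from math import comb


def difference(series, order):
    """Return the finite differences of the given order."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def error_variance_from_differences(series, order=3):
    """Estimate the error variance from differences of the given order.

    Assumes the smooth component vanishes after differencing, so that
    Var(difference) = C(2k, k) * sigma^2 for independent errors.
    """
    if len(series) <= order:
        raise ValueError("series too short for the requested difference order")
    d = difference(series, order)
    return sum(x * x for x in d) / (len(d) * comb(2 * order, order))


# A purely quadratic series has no error component: the estimate is zero.
quadratic = [t * t for t in range(10)]
print(error_variance_from_differences(quadratic))
```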

Tables 8 and 9 give estimates of the price and wage elasticities under 
various assumptions for the demand and the supply function for labor. 
The 95 per cent confidence limits are also given. 

Table 10 presents the sum of squares of the deviations and the test 
function for homogeneity (F). 

The results are somewhat inconclusive, especially as far as the supply 

functions are concerned. Table 10 indicates homogeneity of the demand 
and supply functions of labor. 


TABLE 8

Estimates, Tests of Significance, Confidence Limits

Regression                 Estimate     Test of            95 Per Cent
Coefficient                             Significance (t)   Confidence Limits

Demand with Time Trend, Ordinary Regression
Price elasticity            0.588481      0.734         2.299928   -1.122966
Wage elasticity            -0.413997      0.318        -3.190298    2.362304
Time trend                  0.005312      0.401         0.028862   -0.018238

Demand with Time Trend, Weighted Regression
Price elasticity            0.682176      6.437†        0.908002    0.456350
Wage elasticity            -0.577298      3.241†       -0.956831   -0.197765
Time trend                  0.006022      4.259†        0.009035    0.003009

Demand without Time Trend, Ordinary Regression
Price elasticity            0.325151      0.667         1.358962   -0.708661
Wage elasticity            -0.252278      0.230        -2.578586    2.074030

Demand without Time Trend, Weighted Regression
Price elasticity            0.389551     11.305†        0.456663    0.312439
Wage elasticity            -0.398335      5.011†       -0.566858   -0.229812

Homogeneous Demand with Time Trend
Real wage elasticity       -0.730658      6.171†        0.982981    0.478335
Time trend                  0.005965      0.819         0.021481   -0.009551

Homogeneous Demand without Time Trend
Real wage elasticity       -0.390335      5.355†        0.544198    0.236522

* Significant at the 5 per cent level.
† Significant at the 1 per cent level.


TABLE 9

Estimates, Tests of Significance, Confidence Limits

Regression                 Estimate     Test of            95 Per Cent
Coefficient                             Significance (t)   Confidence Limits

Supply with Time Trend, Ordinary Regression
Price elasticity            0.143248      1.458         0.352559   -0.066063
Wage elasticity             0.068538      0.797         0.251804   -0.114728
Time trend                  0.000913      1.582         0.002143   -0.000317

Supply with Time Trend, Weighted Regression
Price elasticity            0.156347      2.410*        0.294600    0.018094
Wage elasticity             0.067050      1.171         0.189086   -0.054986
Time trend                  0.001032      4.607†        0.001510    0.000554

Supply without Time Trend, Ordinary Regression
Price elasticity            0.005764      0.111         0.116184   -0.104656
Wage elasticity             0.004334      2.537*        0.306573    0.027441

Supply without Time Trend, Weighted Regression
Price elasticity           -0.199840      1.675        -0.452735    0.053055
Wage elasticity             0.438445      2.867*        0.762604    0.114286

Homogeneous Supply with Time Trend
Real wage elasticity       -0.296233      2.131*       -0.590962   -0.001504
Time trend                  0.000314      0.021         0.032695   -0.032067

Homogeneous Supply without Time Trend
Real wage elasticity       -0.192039      4.063†       -0.291757   -0.097519

* Significant at the 5 per cent level.
† Significant at the 1 per cent level.

TABLE 10

Analysis of Variance Tests

                                Weighted Sum of Squares
                                Without          With            Variance
                                Restrictions     Restrictions    Ratio (F)
Demand with time trend            17.964           21.042          2.569
Demand without time trend         27.540           40.176          7.800*
Supply with time trend            28.962           29.430          4.126
Supply without time trend         28.134           29.556          1.236

There ought to be a warning about the interpretation of our results.
The estimates and probabilities given are valid only under the following
assumptions:





(a) The theoretical framework must be valid. There must be a static
demand and supply function in the unknown hypothetical infinite population
which are at least approximately linear in the logarithms.

(b) There must be no errors in the equations.

(c) The errors in the variables must have constant variance over time.

(d) They must be independent from series to series. The individual
items of each error series must also be independent.

(e) The error variance must be known.

(f) The errors in the variables must be normally distributed.

(g) The sample must be large.





The reader can appreciate that these conditions are only partially 
fulfilled in our case. Hence not too much importance ought to be given
to the probabilities established and the estimates calculated. The prob¬ 
abilities may be regarded as a kind of limit which would hold if we had 

ideal data and worked under ideal conditions. But this is assuredly not 
the case. 

Even under ideal conditions we would not be able to use the existing 

methods of statistical inference for the problem of induction. The 

probabilities are still related to relative frequencies. What is desired are 

probabilities which express the degree of confirmation of a hypothesis 

based upon a certain empirical evidence. As indicated in section 1.2 

such a system of inductive inference has been developed by Carnap. It 

is unfortunate that his methods are not yet applicable to econometric 

problems. His theory is, for instance, not able to deal with continuous 
variables. 

As far as practical results are concerned, it seems that there may be 
a little more to be said for our contribution to an analysis of the demand 
side of labor than of the supply side. It is very likely that the interwar 
demand function of labor in industry in Great Britain was homogeneous 
in wages and prices. It is probable that the demand for labor depended 
upon real wages rather than upon money wages. 

With due caution we may say that our best guess of the magnitude of 

the response of industry to an increase of 1 per cent in real wages is as 
follows: 

There should be a decrease in the demand for labor of not less than
about 1/5 of 1 per cent and not more than about 1 per cent. These results
are, of course, very tentative and approximate. Still, they may give a
certain guidance to policy. In certain situations it may be of some importance
to note that it is very unlikely that an increase in real wages
would bring about no response at all (elasticity zero). It is also very
improbable that an increase of 1 per cent in real wages would be followed
by a decrease in demand for labor in industry of as much as, say, 10
per cent.

These conclusions are, of course, not very helpful. They are based 
upon the experience of the interwar period. The probabilities indicated 
are valid only under very restrictive assumptions. For instance, condi¬ 
tions must be approximately the same as during the years analyzed. 

But one may reason by analogy and say that the conclusions indicated 
may conceivably hold approximately also for the present period and for 
the near future. This does not guarantee, of course, that they will also 
hold in the more distant future if some fundamental changes should take 
place in British industry which could change its character compared to 





the interwar period. Assume, for instance, that a cheap new source of 

power becomes available in the future, such as atomic energy. This 

would, of course, change completely the relationship between various 

factors of production and would make all practical applications of our
results valueless. 

As far as the supply function for labor in British industry is concerned 
we are on ground which is even less firm than that on which we placed 
ourselves in discussing the demand for labor. It appears from our 
analysis that it is somewhat doubtful whether wages and prices exert any 
influence at all upon the supply of labor in industry. If they exert an 
influence it seems again likely that the supply function is homogeneous 
for wages and prices. This is to say that the supply function of labor in
industry appears to depend upon real wages rather than upon money 
wages, in so far as it depends upon wages at all. 

Our results indicate that the supply of labor in industry may be totally 

irresponsive to a change in real wages (elasticity zero), a conclusion which 

cannot entirely be ruled out. On the other hand, it is very unlikely that 

the supply of labor increases with an increase in real wages. It seems 
rather to have a tendency to decrease. 

It is not very likely that our conclusions about the labor supply function
in British industry have very much value for future economic policy.
Conditions of the labor supply have probably been changed fundamentally
through the war experience. A whole group of people who had not
been employed before have been drawn into the industrial labor market.
This must have had very profound repercussions for the future supply of
labor in industry which cannot possibly be deduced from our analysis.

The same is true about the long period of full employment, which may
have affected the supply of labor profoundly.

As far as any theoretical conclusions from our analysis are concerned,
we have to repeat again the caution which we have stressed a number of
times. The pure economic theorist may feel that his faith in his
theoretical structure cannot be shaken by any evidence, however
excellent. We do not, of course, claim any particular excellence
for our evidence. It should by necessity be somewhat doubtful whether
it can be claimed at all as evidence for the particular theories we have
[...] similar in nature to physics [...] investigations should
throw some light upon theoretical problems.

[...] the analysis of the demand for labor is
less subject to criticism than the analysis of supply. Here we find
ourselves happily in agreement with the theoretical expectations. The





demand function of labor in British industry is probably a homogeneous 
function of zero degree in prices and wages. This is to say that, in all 
likelihood, the demand for labor in British industry depended upon real 
wages rather than upon money wages. This agrees, of course, with the 
position which has been taken by many economists. 51 It conforms, for 
instance, to the marginal productivity theory. 52 The homogeneity of 
the demand for factors of production holds not only under free competi¬ 
tion. 53 If there are elements of imperfect competition in the situation 
our findings agree also with the theory. For many forms of imperfect 
competition the static demand functions for the factors of production 

are homogeneous of zero degree in the prices. 

The evidence regarding the supply function of labor is less reliable. 
It is interesting to note that, in so far as we have any evidence at all, it 
seems rather to support the classical position 54 than the Keynesian theories. 
In so far as wages and prices take any part at all in influencing the supply 
of labor in industry, it seems that the supply function of labor is homo¬ 
geneous of zero degree in prices and wages. Hence we must consider 
this as some evidence, however feeble, against Keynes and his followers 
who have stressed the fact that the supply of labor is a function of money 
wages rather than of real wages. 

It is interesting to note that according to our results it is by no means 
impossible that the supply function of labor in industry should have 
zero elasticity—this is to say, should be completely independent of real 
wages. 55 On the other hand, it seems somewhat more likely that the 
real wage elasticity is negative than that it is positive. This is to say that 
an increase in real wages may have a tendency to bring about a decrease 
in the supply of labor for industry rather than an increase. 

This is again in agreement with some ideas put forward by several 
economic theorists. 56 It is not a very surprising result for England in the 


51 P. A. Samuelson: Foundations of Economic Analysis (Cambridge, Mass.,
1947), pp. 68 ff., 83 ff.

52 B. F. Haley: op. cit., pp. 26 ff.

53 G. Tintner: "Homogeneous systems in mathematical economics," Econometrica,
vol. 16 (1948), pp. 273 ff.

54 W. W. Leontief: "Postulates: Keynes' general theory and the classicists,"
in S. E. Harris, ed.: The New Economics (New York, 1947), pp. 232 ff.

55 J. Marschak: "Wages," in the Encyclopaedia of the Social Sciences, vol.
15 (New York, 1935), pp. 291 ff.

56 A. L. Bowley: The Mathematical Groundwork of Economics (Oxford, 1924),
pp. 40 ff. J. R. Hicks: op. cit., pp. 36 ff. U. Ricci: "Die Arbeit in der
Individualwirtschaft," Die Wirtschaftstheorie der Gegenwart, vol. 3 (Vienna),
pp. 113 ff. E. H. Schoenberg and P. H. Douglas: "Studies in the supply





interwar period. Workers in industry, if they were employed, had 
reached a reasonably high standard of living, and the choice between 
earnings and leisure may have become important. This conclusion, 
however, should not be taken too seriously because of the considerable 
unreliability of our results. 

We want to emphasize again that we do not want to stress too much 
the importance (practical or theoretical) of our results. It would be best 
to consider our efforts an exercise in methodology. It is to be hoped 
that further investigations in this matter will be undertaken by econometricians
who will be able to make more detailed and hence more
valuable studies. 


of labor. The relation in 1929 between average earnings in America and the
proportion seeking employment," Journal of Political Economy, vol. 45 (1937),
pp. 146 ff.



Chapter 7 



Stochastic Models with Errors in the 
Equations 


We will here deal with the methods developed by the members of the 
Cowles Commission at the University of Chicago and their collaborators. 
The assumption is always that there are errors in the equations but none 
in the variables. The problem of errors in the variables has been dis¬ 
cussed in the last section of the previous chapter, section 6.5. Errors 
of observation, etc., are assumed to be absent. We are not yet able to 

analyze static systems which have both errors in the equations and errors 
in the variables. 

Estimation methods are available which deal with both errors in the 
equations and errors in the variables if we have specific dynamic models. 1 
But these cases are rather exceptional. The problem arises of what to 

do in cases in which we have, for instance, static models and have to 
choose between the two methods. 

No general rule can be given for this decision. If we believe that no 
important variables have been left out of our system of equations, we 
may neglect the errors in the equations and deal only with the errors in 
the variables. This will particularly be the case if we have reason to 
believe that our data are subject to large errors of observations, that they 
represent the microeconomic quantities rather poorly, etc. 

Assume, on the other hand, that we have a situation where we are
prepared to trust our data. There are no large errors in the variables. 
But there are certain variables which ought to appear in our system but 
have been left out. These variables give rise to rather large errors in 
the equations. In such a case we will use the methods of the present 
chapter. 

Complications which arise in connection with the specific time series 
problems (serial correlation, autocorrelation, stochastic difference equa¬ 
tions, etc.) will be treated in Part 3. 


1 L. Hurwicz: “Trend and seasonality," in T. Koopmans, ed.: Statistical 
Inference in Dynamic Economic Models (New York, 1950), pp. 338 ff. 





We will present in this chapter the following problems: (a) identification 
(section 7.1); (b) estimation of equations which are just identified (section 
7.2); (c) estimation of single over-identified equations (section 7.3). We 
will again stress estimation methods. Our discussion will be in terms of 
the sample, for the sake of simplicity. Actually it is the purpose of 
estimation methods to estimate the corresponding parameters in a hypo¬ 
thetical infinite population which corresponds to our sample. Tests of 
significance, etc., will, however, be indicated in the cases for which they 
have been established. 

7.1 Identification 2 

We deal here only with linear structures. All variables are assumed 
to be observed without errors (section 6.5). Hence, for instance, errors 
of observations are absent. But on the other hand there are still errors 
in the equations. These errors are the result of the variables which 
should have been included in the system but which have been omitted. 
These variables may have been omitted because their existence is not 
suspected by the investigator, because we have no observations for them, 
or for other reasons. It is assumed that they give rise to disturbances 
which can be represented by certain random variables. These variables 
are assumed to be normally distributed. 


Our model consists of G equations. There are two types of variables 
in the system. First, we have G jointly dependent or endogenous variables.

2 R. Frisch: Pitfalls in the Statistical Construction of Demand and Supply
Curves (Leipzig, 1933). E. J. Working: "What do statistical demand curves
show?" Quarterly Journal of Economics, vol. 41 (1927), pp. 212 ff. T. Haavelmo:
"The statistical implications of a system of simultaneous equations," Econometrica,
vol. 11 (1943), pp. 1 ff.; "The probability approach in econometrics,"
ibid., vol. 12 (1944), supplement. L. R. Klein: "Pitfalls in the statistical
determination of the investment schedule," ibid., vol. 11 (1943), pp. 246 ff.
T. C. Koopmans: "Statistical estimation of simultaneous economic relations,"
Journal of the American Statistical Association, vol. 40 (1945), pp. 448 ff.
J. Marschak: "Economic interdependence and statistical analysis," Studies in
Mathematical Economics and Econometrics (Chicago, 1942), pp. 135 ff. J.
Marschak and W. H. Andrews: "Random simultaneous equations and the
theory of production," Econometrica, vol. 12 (1944), pp. 143 ff. T. C. Koopmans
and O. Reiersol: "The identification of structural characteristics," Annals
of Mathematical Statistics, vol. 21 (1950), pp. 165 ff.
T. C. Koopmans: "Identification problems in economic model construction,"
Econometrica, vol. 17 (1949), pp. 125 ff. T. C. Koopmans, H. Rubin, and R. B.
Leipnik: "Measuring the equations system in dynamic economics," in T. C.
Koopmans, ed.: Statistical Inference in Dynamic Economic Models (New York,





These are the variables which are actually determined by the system of
structural equations which we are investigating. We denote them by
$Y_1, Y_2, \ldots, Y_G$. In econometric work they are variables like prices,
quantities bought and sold, and interest rates.

Another type of variables is the exogenous variables or predetermined
variables. There are K exogenous variables, $Z_1, Z_2, \ldots, Z_K$. These
variables are, for instance, time, the weather, population, but also past
values of the endogenous variables, e.g., $Y_{1,t-1}$. They are here simply
treated as fixed variates. Actually they give rise to stochastic difference
equations. These will be treated in section 10.3.

Which variables are endogenous and which are exogenous will depend 
upon the theoretical assumptions underlying the economic model. In 
a short-run model we will, for instance, treat fixed capital as an exogenous 
variable. But fixed capital will be an endogenous variable in a long-run 
economic model. 

Certain logical and philosophical difficulties arise in connection with 
model construction and the choice between various models on the basis 
of the empirical data. There are as yet no valid methods of statistical 
inference which would enable us to deal with problems of this nature. 
Some of these difficulties have been discussed in section 1.2. 

Our system of structural equations (model) is now: 

(1) $b_{i0} + \sum_{j=1}^{G} b_{ij} Y_j + \sum_{j=1}^{K} c_{ij} Z_j = \epsilon_i \qquad (i = 1, 2, \ldots, G)$

The random variables $\epsilon_i$ are the errors in the equations. They result
from the fact that certain variables have not been included in the respective 
equations. 
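
To make concrete the sense in which the endogenous variables are "determined by the system," the following sketch solves a two-equation instance of system (1) for $Y_1$ and $Y_2$, given the exogenous variables and the errors. It is an illustration only: the function name `solve_endogenous`, the restriction to G = 2, and the numerical coefficients are assumptions made here and are not taken from the text.

```python
def solve_endogenous(b0, B, C, z, eps):
    """Solve a two-equation instance of system (1) for Y_1 and Y_2.

    Equation i reads: b0[i] + sum_j B[i][j] * Y_j + sum_j C[i][j] * Z_j = eps[i].
    Illustrative sketch for G = 2 only, using Cramer's rule.
    """
    rhs = [
        eps[i] - b0[i] - sum(c * zj for c, zj in zip(C[i], z))
        for i in range(2)
    ]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if det == 0:
        raise ValueError("the endogenous variables are not uniquely determined")
    y1 = (rhs[0] * B[1][1] - B[0][1] * rhs[1]) / det
    y2 = (B[0][0] * rhs[1] - B[1][0] * rhs[0]) / det
    return y1, y2


# Hypothetical coefficients: one exogenous variable entering the first equation only.
y = solve_endogenous(
    b0=[0.0, 0.0],
    B=[[1.0, 1.0], [1.0, -1.0]],
    C=[[-1.0], [0.0]],
    z=[4.0],
    eps=[0.0, 0.0],
)
```

With these made-up values both equations are satisfied at $Y_1 = Y_2 = 2$; changing `eps` shows how the errors in the equations shift the jointly determined values.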

Our purpose is the estimation of the structural constants or coefficients
$b_{i0}$, $b_{ij}$ ($i, j = 1, 2, \ldots, G$) and $c_{ij}$ ($i = 1, 2, \ldots, G$; $j = 1, 2, \ldots, K$).

The equations in system (1) can be identified if some of the constants $b_{ij}$
and $c_{ij}$ are zero. They must be identified if we want to distinguish
various economically significant relations like demand functions, supply 
functions, the consumption function. This is of importance for certain 
applications to policy, as Haavelmo has shown. 4 We consider here only 


1950), pp. 53 ff. A. Wald: "Note on the identification of economic relations,"
ibid., pp. 238 ff. L. Hurwicz: "Generalisation of the concept of identification,"
ibid., pp. 245 ff. G. Tintner: "Die Identifikation: ein Problem der Oekonometrie,"
Statistische Vierteljahrsschrift, vol. 3 (1950), pp. 7 ff.

4 T. Haavelmo: "The probability approach in econometrics," Econometrica,
vol. 12 (1944), supplement.





identification by the condition that certain variables are absent in some 
specific equations of system (1). 

The rules for identification are as follows: A necessary condition for
identification of a given equation in the structural system (1) (model) is:
The number of variables excluded from this equation must be at least
G - 1, i.e., 1 less than the total number of structural equations (and also
of endogenous variables) in the whole system. A necessary and sufficient
condition is that we can form at least one non-vanishing determinant of
order G - 1 (Appendix A.1.2) from the coefficients with which the variables
excluded from our equation appear in the G - 1 other structural
equations.
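
The two rules can be checked mechanically. The sketch below is one possible reading, under the assumption that a system is described by a table of coefficients in which a zero means that the variable is excluded from the equation. The existence of a non-vanishing determinant of order G - 1 is tested as the condition that the matrix of coefficients of the excluded variables in the other equations has rank G - 1. The function names and the numerical coefficients are hypothetical.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            f = m[r][col] / m[rank][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def identification(coeffs, eq):
    """Return (necessary condition met, necessary and sufficient condition met).

    coeffs[i][v] is the coefficient of variable v in equation i; 0 = excluded.
    """
    G = len(coeffs)
    excluded = [v for v, c in enumerate(coeffs[eq]) if c == 0]
    order_ok = len(excluded) >= G - 1
    # Coefficients of the excluded variables in the G - 1 other equations.
    others = [[row[v] for v in excluded] for i, row in enumerate(coeffs) if i != eq]
    rank_ok = matrix_rank(others) == G - 1
    return order_ok, rank_ok


# Hypothetical system in the variables (y1, y2, z1): the first equation
# contains z1, the second excludes it.
system = [[1, 2, 3], [1, -1, 0]]
print(identification(system, 0), identification(system, 1))
```

Exact fractions are used so that a zero coefficient is recognized as exactly zero; with estimated coefficients one would need a tolerance instead.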

In what follows we will use deviations from the means of our variables,
which will be denoted by $y_i$, $z_j$. The constants $b_{i0}$ are given by the
condition that the best fit goes through the means of all variables.

Example 1. Let us consider a system of a linear static demand and a
linear static supply function. Denote the price by $y_1$ and the quantity
by $y_2$ (G = 2). Let the demand function be:

(2) $b_{11} y_1 + b_{12} y_2 = \epsilon_1$

Here $\epsilon_1$ is the disturbance or error in equation (2). It is a random variable
which represents all the variables which actually influence the demand
relationship but are not included in (2). Such variables are, for instance,
income, the income distribution, prices of related commodities, general
conditions of business.

The supply equation is:

(3) $b_{21} y_1 + b_{22} y_2 = \epsilon_2$

The random variable $\epsilon_2$ is the error in equation (3). It represents the
variables which influence the supply relationship but are absent from
equation (3). Such variables are, for instance, costs of various factors
of production, stocks, wages, technological conditions of production.

We have two equations in our system or model and also two jointly
determined variables (G = 2). But there are no exogenous variables
(K = 0).

Consider first the identification of equation (2). In order to have it
identified there should be at least one variable (G - 1 = 1) which appears
in (3) but not in (2). But both variables $y_1$ and $y_2$ appear in both equations.
Hence equation (2) is not identified.

Next consider equation (3). In order for this equation to be identified
there should be at least G - 1 = 1 variable which appears in equation (2)
but not in equation (3). But this is clearly not the case. Hence equation
(3) is also not identified.





That equations (2) and (3) are not identified can also be seen in the
following way: Let us assume we multiply equation (2) with a constant
$K_1$ and equation (3) with a constant $K_2$. We form by addition the
"pseudo-demand function":

(4) $(K_1 b_{11} + K_2 b_{21}) y_1 + (K_1 b_{12} + K_2 b_{22}) y_2 = K_1 \epsilon_1 + K_2 \epsilon_2$

Next we multiply equation (2) by a constant $L_1$ and equation (3) by a
constant $L_2$ and add. In this way we have the "pseudo-supply function":

(5) $(L_1 b_{11} + L_2 b_{21}) y_1 + (L_1 b_{12} + L_2 b_{22}) y_2 = L_1 \epsilon_1 + L_2 \epsilon_2$

Now it will be apparent that the "pseudo-demand function" (4) has
the same appearance as the true demand function (2). The same variables
enter into both. But on the basis of our assumptions and the knowledge
of the observed variables $y_1$ and $y_2$ we cannot distinguish between (2)
and (4). Hence the equation is not identified.

Similarly, the "pseudo-supply function" (5) and the supply function
(3) have the same appearance. Our observations of $y_1$ and $y_2$ do not
permit us to distinguish between (3) and (5). Hence the equation is not
identified.

It is interesting to note that our system of equations could have been
identified by the condition that $\epsilon_1$ and $\epsilon_2$ are not correlated. If instead
of the demand function (2) and the supply function (3) we use the system
of the "pseudo-functions" (4) and (5), we see that the error terms in (4)
and (5), namely, $K_1 \epsilon_1 + K_2 \epsilon_2$ and $L_1 \epsilon_1 + L_2 \epsilon_2$, are uncorrelated only if
$K_2 = 0$ and $L_1 = 0$. Hence under this assumption both equations are
identified.

Example 2. We consider a system similar to that in Example 1. But
let us introduce now a new variable, say income $z_1$ (an exogenous variable),
which appears in the demand equation but not in the supply equation.
Hence our demand function is now:

(6) $b_{11} y_1 + b_{12} y_2 + c_{11} z_1 = \epsilon_1$

The supply equation is as before:

(7) $b_{21} y_1 + b_{22} y_2 = \epsilon_2$

Consider first equation (6) with respect to identification. We have
again G = 2. Hence there should be at least G - 1 = 1 variable which
appears in equation (7) but not in equation (6). This is clearly not the
case: The two variables $y_1$ and $y_2$ which appear in (7) are also in equation
(6). Hence the demand equation (6) is not identified.

But now consider the supply equation (7). There must again be at
least G - 1 = 1 variable which appears in the other equation (6) but





is excluded from (7). If this is the case the necessary condition for 
identification is fulfilled. There is indeed one such variable, namely z x . 
Hence the equation is possibly identified. 

For the necessary and sufficient condition of identification we require
that we can form at least one non-zero determinant of order $G - 1 = 1$
from the coefficients of the variables which appear in (6) but not in (7).
But the only variable which satisfies this condition is $z_1$. Hence our
condition is that $c_{11} \neq 0$. That is to say: Equation (7) is identified
only if the variable $z_1$ actually enters with a non-zero coefficient $c_{11}$ into
equation (6).

Let us now again multiply the demand function (6) by an arbitrary
constant $K_1$ and the supply function (7) by an arbitrary constant $K_2$ and
add. The result is the “pseudo-demand function”:

(8)  $(K_1 b_{11} + K_2 b_{21}) y_1 + (K_1 b_{12} + K_2 b_{22}) y_2 + K_1 c_{11} z_1 = K_1 \epsilon_1 + K_2 \epsilon_2$


There is evidently no way in which we can distinguish this constructed
“pseudo-demand function” (8) from the demand function (6) on the
basis of our observations on $y_1$, $y_2$, and $z_1$. The same variables appear
in both equations.

Next we multiply our demand function (6) with an arbitrary constant
$L_1$ and the supply function (7) with an arbitrary constant $L_2$ and add. The
result is:

(9)  $(L_1 b_{11} + L_2 b_{21}) y_1 + (L_1 b_{12} + L_2 b_{22}) y_2 + L_1 c_{11} z_1 = L_1 \epsilon_1 + L_2 \epsilon_2$


This is our “pseudo-supply function.” But now equation (9) can
easily be distinguished from equation (7) if $c_{11} \neq 0$. Under this condition
we must have $L_1 = 0$. Only then will the form of (9) be identical with
(7), i.e., our supply function: It does not involve the variable $z_1$. Hence
the supply function (7) is identified if $c_{11} \neq 0$.
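
The counting argument of Example 2 can be sketched in a few lines of Python. This is only an illustration of the necessary condition (at least $G - 1$ variables of the system excluded from the equation); the function name `order_condition` and the dictionary layout are assumptions made for this sketch and are not part of the text.

```python
from typing import Dict, Set


def order_condition(system: Dict[str, Set[str]], equation: str) -> bool:
    """Necessary condition only: at least G - 1 variables that occur in the
    system must be absent from the given equation."""
    g = len(system)  # number of equations (= number of endogenous variables)
    all_variables = set().union(*system.values())
    excluded = all_variables - system[equation]
    return len(excluded) >= g - 1


# Example 2: demand (6) contains y1, y2, z1; supply (7) contains y1, y2.
example_2 = {"demand": {"y1", "y2", "z1"}, "supply": {"y1", "y2"}}

demand_ok = order_condition(example_2, "demand")  # no variable is excluded
supply_ok = order_condition(example_2, "supply")  # z1 is excluded
print(demand_ok, supply_ok)
```

The check says nothing about the necessary and sufficient condition, which in this example still requires $c_{11} \neq 0$.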

Example 3. We use the notation of Example 2. Let us now denote
by $z_2$ a cost factor (e.g., wages). Then the demand function is as in
Example 2:

(10)  $b_{11} y_1 + b_{12} y_2 + c_{11} z_1 = \epsilon_1$

But the supply function becomes:

(11)  $b_{21} y_1 + b_{22} y_2 + c_{22} z_2 = \epsilon_2$



Evidently the necessary conditions of identification are satisfied for
both equations (10) and (11). Equation (10) contains the variable $z_1$
which does not appear in (11). Likewise, equation (11) contains the
variable $z_2$ which does not appear in (10). The necessary and sufficient
condition for the identification of (10) is that $c_{22} \neq 0$. The necessary
and sufficient condition for the identification of (11) is that $c_{11} \neq 0$. If
this is the case, both equations are identified.

We may again multiply (10) by the constant $K_1$ and (11) by the constant
$K_2$ and add, in order to form the “pseudo-demand function”:

(12)  $(K_1 b_{11} + K_2 b_{21}) y_1 + (K_1 b_{12} + K_2 b_{22}) y_2 + K_1 c_{11} z_1 + K_2 c_{22} z_2 = K_1 \epsilon_1 + K_2 \epsilon_2$

But on the basis of our knowledge of the observed variables $y_1$, $y_2$, $z_1$, $z_2$
we can distinguish now the “pseudo-demand function” (12) from the
demand function (10). This is possible if $c_{22} \neq 0$. Then we see that
the demand function (10) involves the variables $y_1$, $y_2$, $z_1$, but the
“pseudo-demand function” (12) involves these variables and also $z_2$. We must
have $K_2 = 0$. Hence equation (10) is identified if $c_{22} \neq 0$.

Similarly we multiply (10) by $L_1$ and (11) by $L_2$ in order to form the
“pseudo-supply function”:

(13)  $(L_1 b_{11} + L_2 b_{21}) y_1 + (L_1 b_{12} + L_2 b_{22}) y_2 + L_1 c_{11} z_1 + L_2 c_{22} z_2 = L_1 \epsilon_1 + L_2 \epsilon_2$

We know only our observations on $y_1$, $y_2$, $z_1$, $z_2$. We also know that
the supply function must be of the form (11), i.e., it involves only $y_1$, $y_2$,
and $z_2$. Let us assume that $c_{11} \neq 0$. Then the “pseudo-supply function”
(13) is of the same appearance as the supply function (11) only if we have
$L_1 = 0$. Hence the function (11) is identified, under the condition that
$c_{11} \neq 0$.

Example 4. We consider now a more elaborate model for a static
market of two commodities. Let $y_1$ be the quantity of commodity A,
$y_2$ the quantity of commodity B. Further, $y_3$ is the price of commodity
A, and $y_4$ the price of commodity B. All these are endogenous or jointly
determined variables.

Let $z_1$ be income, $z_2$ a cost factor in the production of commodity A.
Further, $z_3$ and $z_4$ are two cost factors in the production of commodity B.
These are our exogenous variables.

The demand function for commodity A is:

(14)  $b_{11} y_1 + b_{13} y_3 + b_{14} y_4 + c_{11} z_1 = \epsilon_1$

The demand function for commodity B is:

(15)  $b_{22} y_2 + b_{23} y_3 + b_{24} y_4 + c_{21} z_1 = \epsilon_2$

The supply function of commodity A is:

(16)  $b_{31} y_1 + b_{33} y_3 + c_{32} z_2 = \epsilon_3$

The supply function of commodity B is:

(17)  $b_{42} y_2 + b_{44} y_4 + c_{43} z_3 + c_{44} z_4 = \epsilon_4$





We consider first the identification of the demand function for
commodity A, namely (14).

We have $G = 4$, since there are four equations and four endogenous
variables. Hence we must have at least $G - 1 = 3$ variables which enter
into the system but not into equation (14). There are indeed four such
variables, namely $y_2$, $z_2$, $z_3$, $z_4$. Hence the necessary condition for
identification is fulfilled.

The necessary and sufficient condition for identification is as follows: 
We must be able to find at least one non-zero determinant of order 
G — 1 = 3 which can be formed from the coefficients of the variables 
which enter into the system but not into equation (14). These coefficients 
are presented in Table 1. 





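TABLE 1

Equation                  Variables
             $y_2$        $z_2$        $z_3$        $z_4$
(15)         $b_{22}$     0            0            0
(16)         0            $c_{32}$     0            0
(17)         $b_{42}$     0            $c_{43}$     $c_{44}$

The determinants of order 3 which can be formed from this array are:

(18)  $\begin{vmatrix} b_{22} & 0 & 0 \\ 0 & c_{32} & 0 \\ b_{42} & 0 & c_{43} \end{vmatrix} = b_{22} c_{32} c_{43}$

(19)  $\begin{vmatrix} b_{22} & 0 & 0 \\ 0 & c_{32} & 0 \\ b_{42} & 0 & c_{44} \end{vmatrix} = b_{22} c_{32} c_{44}$

(20)  $\begin{vmatrix} b_{22} & 0 & 0 \\ 0 & 0 & 0 \\ b_{42} & c_{43} & c_{44} \end{vmatrix} = 0$

(21)  $\begin{vmatrix} 0 & 0 & 0 \\ c_{32} & 0 & 0 \\ 0 & c_{43} & c_{44} \end{vmatrix} = 0$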







The determinants (20) and (21) are zero. Hence it follows that equation
(14), the demand equation for commodity A, is identified if either (18)
or (19) is not zero.
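
The determinant check for equation (14) can be sketched numerically as follows. The coefficient values are hypothetical and chosen only for illustration; the helper `det3` and the variable names are assumptions of this sketch. The array is the one of Table 1 (rows: equations (15), (16), (17); columns: $y_2$, $z_2$, $z_3$, $z_4$).

```python
from itertools import combinations


def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical coefficient values, for illustration only.
b22, b42, c32, c43, c44 = 2.0, 3.0, 5.0, 7.0, 11.0

table_1 = [
    [b22, 0.0, 0.0, 0.0],  # equation (15)
    [0.0, c32, 0.0, 0.0],  # equation (16)
    [b42, 0.0, c43, c44],  # equation (17)
]
columns = ["y2", "z2", "z3", "z4"]

# All determinants of order 3 obtained by choosing three of the four columns.
dets = {}
for cols in combinations(range(4), 3):
    minor = [[row[j] for j in cols] for row in table_1]
    dets[tuple(columns[j] for j in cols)] = det3(minor)

# Equation (14) is identified if at least one determinant is not zero.
identified = any(abs(value) > 1e-12 for value in dets.values())
```

With these values the first two determinants equal $b_{22} c_{32} c_{43}$ and $b_{22} c_{32} c_{44}$, and the other two vanish, in agreement with (18) to (21).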

We consider next the demand equation for commodity B, namely (15).
The necessary condition for identification of the equation is satisfied if
there are at least $G - 1 = 3$ variables which enter into the system but
not into equation (15). There are indeed four such variables, namely
$y_1$, $z_2$, $z_3$, $z_4$.

In order to have the necessary and sufficient condition for identification 
satisfied we must be able to form at least one non-zero determinant of 
order 3 from the coefficients of the variables which enter into the system 
but not into equation (15). These coefficients are given in Table 2. 


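TABLE 2

Equation                  Variables
             $y_1$        $z_2$        $z_3$        $z_4$
(14)         $b_{11}$     0            0            0
(16)         $b_{31}$     $c_{32}$     0            0
(17)         0            0            $c_{43}$     $c_{44}$

From this array we have to form at least one non-zero determinant of
order 3. The determinants are:

(22)  $\begin{vmatrix} b_{11} & 0 & 0 \\ b_{31} & c_{32} & 0 \\ 0 & 0 & c_{43} \end{vmatrix} = b_{11} c_{32} c_{43}$

(23)  $\begin{vmatrix} b_{11} & 0 & 0 \\ b_{31} & c_{32} & 0 \\ 0 & 0 & c_{44} \end{vmatrix} = b_{11} c_{32} c_{44}$

(24)  $\begin{vmatrix} b_{11} & 0 & 0 \\ b_{31} & 0 & 0 \\ 0 & c_{43} & c_{44} \end{vmatrix} = 0$

(25)  $\begin{vmatrix} 0 & 0 & 0 \\ c_{32} & 0 & 0 \\ 0 & c_{43} & c_{44} \end{vmatrix} = 0$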






Hence the demand equation of commodity B, namely (15), will be 
identified if at least one of the expressions (22) or (23) is not zero. 

We consider now the identification of the supply equation of
commodity A, namely (16). We must again have at least $G - 1 = 3$ variables
which appear in the system but not in this equation. There are indeed
now five such variables, namely $y_2$, $y_4$, $z_1$, $z_3$, and $z_4$. Hence the necessary
condition for identification is fulfilled.

For an investigation of the necessary and sufficient condition of the
identification of equation (16) we make the array of the coefficients of
the variables which enter into the system but not into equation (16):

TABLE 3

Equation                        Variables
             $y_2$        $y_4$        $z_1$        $z_3$        $z_4$
(14)         0            $b_{14}$     $c_{11}$     0            0
(15)         $b_{22}$     $b_{24}$     $c_{21}$     0            0
(17)         $b_{42}$     $b_{44}$     0            $c_{43}$     $c_{44}$


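One of the determinants of order 3 which can be formed from this
array must not be zero. The determinants are:

(26)  $\begin{vmatrix} 0 & b_{14} & c_{11} \\ b_{22} & b_{24} & c_{21} \\ b_{42} & b_{44} & 0 \end{vmatrix} = b_{14} c_{21} b_{42} + b_{22} b_{44} c_{11} - b_{42} b_{24} c_{11}$

(27)  $\begin{vmatrix} 0 & b_{14} & 0 \\ b_{22} & b_{24} & 0 \\ b_{42} & b_{44} & c_{43} \end{vmatrix} = - c_{43} b_{14} b_{22}$

(28)  $\begin{vmatrix} 0 & b_{14} & 0 \\ b_{22} & b_{24} & 0 \\ b_{42} & b_{44} & c_{44} \end{vmatrix} = - c_{44} b_{14} b_{22}$

(29)  $\begin{vmatrix} b_{14} & c_{11} & 0 \\ b_{24} & c_{21} & 0 \\ b_{44} & 0 & c_{44} \end{vmatrix} = c_{44} (b_{14} c_{21} - c_{11} b_{24})$

(30)  $\begin{vmatrix} 0 & c_{11} & 0 \\ b_{22} & c_{21} & 0 \\ b_{42} & 0 & c_{43} \end{vmatrix} = - c_{43} c_{11} b_{22}$

(31)  $\begin{vmatrix} 0 & c_{11} & 0 \\ b_{22} & c_{21} & 0 \\ b_{42} & 0 & c_{44} \end{vmatrix} = - c_{44} c_{11} b_{22}$

(32)  $\begin{vmatrix} 0 & 0 & 0 \\ b_{22} & 0 & 0 \\ b_{42} & c_{43} & c_{44} \end{vmatrix} = 0$

(33)  $\begin{vmatrix} b_{14} & c_{11} & 0 \\ b_{24} & c_{21} & 0 \\ b_{44} & 0 & c_{43} \end{vmatrix} = c_{43} (b_{14} c_{21} - c_{11} b_{24})$

(34)  $\begin{vmatrix} c_{11} & 0 & 0 \\ c_{21} & 0 & 0 \\ 0 & c_{43} & c_{44} \end{vmatrix} = 0$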

Hence we come to the conclusion: At least one of the seven
determinants (26), (27), (28), (29), (30), (31), (33) must be different from zero
in order for the supply equation of commodity A, namely (16), to be identified.

We consider finally the identification of equation (17), which is the
supply function of commodity B. We need again $G - 1 = 3$ or more
variables which appear in the system but not in equation (17). There
are no less than four such variables: $y_1$, $y_3$, $z_1$, $z_2$. Hence the necessary
condition for the identification of the supply function of commodity B
is fulfilled.

In order to have the necessary and sufficient condition for identification
satisfied we must be able to form at least one non-zero determinant of
order $G - 1 = 3$ from the coefficients of the variables which do not
appear in equation (17). We give an array of these coefficients in Table 4.

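TABLE 4

Equation                  Variables
             $y_1$        $y_3$        $z_1$        $z_2$
(14)         $b_{11}$     $b_{13}$     $c_{11}$     0
(15)         0            $b_{23}$     $c_{21}$     0
(16)         $b_{31}$     $b_{33}$     0            $c_{32}$

The determinants of order 3 which can be formed from this array are
as follows:

(35)  $\begin{vmatrix} b_{11} & b_{13} & c_{11} \\ 0 & b_{23} & c_{21} \\ b_{31} & b_{33} & 0 \end{vmatrix} = b_{13} c_{21} b_{31} - b_{11} c_{21} b_{33} - c_{11} b_{23} b_{31}$

(36)  $\begin{vmatrix} b_{11} & b_{13} & 0 \\ 0 & b_{23} & 0 \\ b_{31} & b_{33} & c_{32} \end{vmatrix} = c_{32} b_{23} b_{11}$

(37)  $\begin{vmatrix} b_{13} & c_{11} & 0 \\ b_{23} & c_{21} & 0 \\ b_{33} & 0 & c_{32} \end{vmatrix} = c_{32} (b_{13} c_{21} - c_{11} b_{23})$

(38)  $\begin{vmatrix} b_{11} & c_{11} & 0 \\ 0 & c_{21} & 0 \\ b_{31} & 0 & c_{32} \end{vmatrix} = c_{32} b_{11} c_{21}$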

In order for equation (17) to be identified at least one of the
determinants (35), (36), (37), (38) must be different from zero.

7.2 Estimation of Equations Which Are Just Identified 1 

We consider now a single equation out of the total system described
in equation (1) of section 7.1. We deal here with deviations from the
means. The constants $b_{i0}$ can be estimated by the conditions that the
best fit must go through the means of all the variables.

There are $G$ equations in our system or model. 2 We have $G$ endogenous
variables, $y_1 \cdots y_G$, and $K$ exogenous variables, $z_1 \cdots z_K$. Let now
$x_1 \cdots x_H$ ($H \le G$) be the endogenous variables which actually appear
in our equation, and $r_1 \cdots r_{G-H}$ the endogenous variables which appear
in the system but not in the equation. Let further $u_1 \cdots u_F$ ($F \le K$)
be the exogenous variables which appear in the equation in question.
Further $v_1 \cdots v_D$ are the exogenous variables which appear in the system
but not in the equation which we want to estimate ($F + D = K$). Then
the equation which we want to estimate can be written:

(1)  $\sum_{i=1}^{H} b_i x_i + \sum_{j=1}^{F} c_j u_j = \epsilon$

where $b_i$ and $c_j$ are subsets of the constants $b_{ij}$ and $c_{ij}$; $\epsilon$ is a random
variable.

1 T. C. Koopmans: “Statistical estimation of simultaneous economic
relations,” Journal of the American Statistical Association, vol. 40 (1945), pp. 448 ff.

2 T. Haavelmo: “Methods of measuring the marginal propensity to
consume,” Journal of the American Statistical Association, vol. 42 (1947), pp. 105 ff.;
“Quantitative research in agricultural economics: The interdependence between
agriculture and the national economy,” Journal of Farm Economics, vol. 29
(1947), pp. 910 ff. G. Cooper: “The role of econometric models in economic
theory,” ibid., vol. 30 (1948), pp. 101 ff. T. W. Anderson and Herman Rubin:
“Estimation of the parameters of a single equation in a complete system of
stochastic equations,” Annals of Mathematical Statistics, vol. 20 (1949), pp. 46 ff.

The reduced form of the system is derived in the following way: Since 
there are in our complete system G equations in G endogenous variables, 
we can solve for the G endogenous variables in terms of the exogenous 
variables. The reduced form equations which are pertinent to the 
endogenous variables in our structural equation (1) are as follows: 



(2)  $x_i = \sum_{j=1}^{F} A_{ij} u_j + \sum_{j=1}^{D} B_{ij} v_j + \eta_i$   $(i = 1, 2, \cdots H)$

These are the reduced form equations for the endogenous variables
$x_1 \cdots x_H$ which actually appear in the structural equation (1).


(3)  $r_i = \sum_{j=1}^{F} C_{ij} u_j + \sum_{j=1}^{D} D_{ij} v_j + \zeta_i$   $(i = 1, 2, \cdots G - H)$

These are the reduced form equations for the endogenous variables
$r_1 \cdots r_{G-H}$ which do not actually appear in the structural equation (1)
but are in the system.

The random variables $\eta_i$ and $\zeta_i$ are evidently linear combinations of
the errors in the equations, $\epsilon_i$.

Estimates for the regression coefficients appearing in the reduced form
equations (2) and (3) are found by using the classical least squares
regression method. This method has been described in section 5.1.

Equation (1) is just identified if the rank of the matrix $[B_{ij}]$ is $H - 1$.
This is to say, we must have $D = H - 1$, in order that (1) should be
just identified. Equation (1) is just identified if the number of exogenous
variables which are in the system but do not enter into equation (1) is
1 less than the number of endogenous variables which are in the equation.
Equation (1) is over-identified if the number of exogenous variables which
enter into the system but not into the equation is equal to or greater than
the number of endogenous variables which enter into the equation.





If (1) is just identified we multiply (2) by $b_i$ and sum. This gives:

(4)  $\sum_{i=1}^{H} b_i x_i = \sum_{i=1}^{H} \sum_{j=1}^{F} b_i A_{ij} u_j + \sum_{i=1}^{H} \sum_{j=1}^{D} b_i B_{ij} v_j + \sum_{i=1}^{H} b_i \eta_i$

But this must be identical with (1). Hence we have for the estimation
of the structural coefficients $b_i$ and $c_i$ the following relations in the case
of structural equations which are just identified:

(5)  $0 = \sum_{j=1}^{H} b_j B_{ji}$   $(i = 1, 2, \cdots D = H - 1)$

The linear homogeneous system (5) has a determinant which is of
rank $H - 1$. One further condition must be imposed for normalization,
e.g., $b_1 = -1$, or $\sum_{i=1}^{H} b_i^2 = 1$ (Appendix A.1.2).

The remaining structural coefficients $c_i$ are computed from the formula:

(6)  $c_i = - \sum_{j=1}^{H} b_j A_{ji}$   $(i = 1, 2, \cdots F)$

From this it follows that in the case of a structural equation which is
just identified but not over-identified we use the classical least squares
method in attempting to find estimates for the reduced form equations
(2). But in this case there is a one to one relationship between the
structural coefficients and the regression coefficients of the reduced form
equations. Hence we use the transformations (5) and (6) in order to
estimate the structural coefficients themselves.

The distribution of the regression coefficients in the reduced form 
equations is known. 3 It may be used to find the distributions of the 
estimates of the structural coefficients. 

A method which takes account of the time series nature of the data 
will be presented in section 11.1.2, Example 1. 

Example 1. Denote by $Y_1$ the quantity of meat consumed, by $Y_2$ the
price of meat, by $Z_1$ the per capita disposable income, by $Z_2$ the cost of
processing meat. 4 A more detailed description of these observed variables
will be given in connection with Example 1 of section 7.3. The following
system is analyzed:

(7)  $b_{11} y_1 + b_{12} y_2 + c_{11} z_1 = \epsilon_1$

(8)  $b_{21} y_1 + b_{22} y_2 + c_{22} z_2 = \epsilon_2$

3 S. S. Wilks: Mathematical Statistics (Princeton, 1943), pp. 234 ff. 

4 G. Tintner: “Static econometric models and their empirical verification,
illustrated by a study of the American meat market,” Metroeconomica, vol. 2
(1951), pp. 172 ff.





The small letters designate the deviations from the means. Equation
(7) may be considered the demand function for meat, and equation (8)
the supply function.

We have used annual data from the American economy, 1919-41.
The arithmetic means of the variables are represented in Table 1.

TABLE 1
Means

$Y_1$   Quantity of meat consumed        166.1913
$Y_2$   Price of meat                     92.3391
$Z_1$   Per capita disposable income     495.5652
$Z_2$   Cost of processing meat           88.4217

The sums of squares and cross products which are required in our
analysis are given in Table 2.

TABLE 2
Sums of Squares and Products

          $Y_1$           $Y_2$           $Z_1$            $Z_2$
$Y_1$     1,369.53826     -352.55217      3,671.91304      -536.47565
$Y_2$                     1,581.49478     8,354.59130       850.33044
$Z_1$                                     83,433.65217     3,611.71739
$Z_2$                                                      2,534.79913

We will deal separately with the demand and the supply equation.

(a) Demand for Meat. Equation (7) is the demand equation. We
have $G = 2$, since there are two endogenous variables $y_1$ and $y_2$ in our
system. The variable $z_2$ appears in (8) but not in (7). Hence the
necessary condition of identification is fulfilled, since we have one ($G - 1 =
2 - 1$) variable which appears in the system but not in the equation.

The necessary and sufficient condition for identification requires that
$c_{22} \neq 0$.

Now it can be seen that equation (7) is just identified. The number
of endogenous variables which appear in equation (7) is $H = 2$. The
endogenous variables are $y_1$ and $y_2$. The number of exogenous variables
which appear in our system of equations but not in our equation (7) is
$D = 1$. This is the variable $z_2$. Hence we have $D = H - 1$ and
equation (7) is just identified.

In the notation of the present section, we put $y_1 = x_1$, $y_2 = x_2$, $z_1 = u_1$,
$z_2 = v_1$. Equation (1) appears as follows:

(9)  $b_1 x_1 + b_2 x_2 + c_1 u_1 = \epsilon$

In order to find the reduced form equations we must express $x_1$ and $x_2$
in terms of the exogenous variables $u_1$ and $v_1$ [formula (2)]. This is a
problem in classical regression theory based on the method of least squares.
The data for the solution are taken from Table 2. We have to solve the
system of equations:

(10)  $83{,}433.65217 A_{11} + 3{,}611.71739 B_{11} = 3{,}671.91304$

(11)  $3{,}611.71739 A_{11} + 2{,}534.79913 B_{11} = -536.47565$

This gives the solutions $A_{11} = 0.0566668$ and $B_{11} = -0.2923865$.

The second system of equations is:

(12)  $83{,}433.65217 A_{21} + 3{,}611.71739 B_{21} = 8{,}354.59130$

(13)  $3{,}611.71739 A_{21} + 2{,}534.79913 B_{21} = 850.33044$

The solutions of this system of equations are $A_{21} = 0.0912396$ and
$B_{21} = 0.2054593$.
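
The two systems of normal equations can be checked with a short Python sketch. It uses only the sums of squares and products of Table 2; the helper `solve_2x2` (Cramer's rule) and the variable names are assumptions of this sketch. The results should agree with the printed solutions up to small rounding differences in the last digits.

```python
def solve_2x2(a11, a12, a21, a22, r1, r2):
    """Solve a11*x + a12*y = r1, a21*x + a22*y = r2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det


# Sums of squares and products from Table 2 (deviations from the means).
S_z1z1, S_z1z2, S_z2z2 = 83433.65217, 3611.71739, 2534.79913
S_z1y1, S_z2y1 = 3671.91304, -536.47565
S_z1y2, S_z2y2 = 8354.59130, 850.33044

# Regression of x1 = y1 on u1 = z1 and v1 = z2: equations (10) and (11).
A11, B11 = solve_2x2(S_z1z1, S_z1z2, S_z1z2, S_z2z2, S_z1y1, S_z2y1)

# Regression of x2 = y2 on u1 = z1 and v1 = z2: equations (12) and (13).
A21, B21 = solve_2x2(S_z1z1, S_z1z2, S_z1z2, S_z2z2, S_z1y2, S_z2y2)

print(A11, B11, A21, B21)
```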

The reduced form equations (2) for the endogenous variables $x_1$ and $x_2$
which actually appear in equation (7) are:

(14)  $x_1 = 0.0566668 u_1 - 0.2923865 v_1$

(15)  $x_2 = 0.0912396 u_1 + 0.2054593 v_1$

For the estimates of the structural coefficients $b_1$ and $b_2$ we have from
(5):

(16)  $-0.2923865 b_1 + 0.2054593 b_2 = 0$

We normalize our equation by making $b_1 = -1$. This gives us the
estimate $b_2 = -1.4230872$.

For the estimate of the structural coefficient $c_1$ we have from (6):

(17)  $c_1 = -(0.0566668)(-1) - (0.0912396)(-1.4230872) = 0.1865084$

Hence our estimate of the structural equation (7) becomes finally:

(18)  $x_1 = -1.4230872 x_2 + 0.1865084 u_1$

If we take into account that the best fit must pass through the arithmetic
means of all the variables, we have as an estimate of the demand function
for meat:

(19)  $Y_1 = -1.4230872 Y_2 + 0.1865084 Z_1 + 205.1708$

We can use the means given in Table 1 to compute from this relationship
the average elasticities valid at the means: The price elasticity of the
demand for meat is -0.791, and the income elasticity is 0.556.
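
The passage from the reduced form to the demand function can be retraced in a few lines of Python. This is a sketch that starts from the printed reduced form coefficients of (14) and (15) and the means of Table 1; the variable names are assumptions of this sketch, and small differences in the last decimals against the printed figures are to be expected from rounding.

```python
# Reduced form coefficients as printed in equations (14) and (15).
A11, B11 = 0.0566668, -0.2923865  # x1 on u1 and v1
A21, B21 = 0.0912396, 0.2054593   # x2 on u1 and v1

# Means from Table 1.
mean_Y1, mean_Y2, mean_Z1 = 166.1913, 92.3391, 495.5652

b1 = -1.0                     # normalization
b2 = -B11 * b1 / B21          # relation (5): B11*b1 + B21*b2 = 0
c1 = -(A11 * b1 + A21 * b2)   # relation (6)

# With b1 = -1 the structural equation reads x1 = b2*x2 + c1*u1.
# The constant follows from the fit passing through the means.
constant = mean_Y1 - b2 * mean_Y2 - c1 * mean_Z1

# Elasticities evaluated at the means.
price_elasticity = b2 * mean_Y2 / mean_Y1
income_elasticity = c1 * mean_Z1 / mean_Y1

print(b2, c1, constant, price_elasticity, income_elasticity)
```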

These results are to be interpreted in the following way: Assume that
conditions are approximately the same as in the interwar period analyzed,
and that other things remain equal. If the price of meat increases by
1 per cent the demand for meat might be expected to decrease by about
4/5 of 1 per cent. On the other hand, if, under similar conditions, ceteris
paribus, per capita disposable income increases by 1 per cent, then the
demand for meat will increase by about 1/2 per cent. The importance
of these results for economic policy is obvious. They should be taken
into account if, for instance, the government decides to fix the price of
meat, or to tax or subsidize the sale of meat.


But the results of our analysis ought to be applied in questions of
economic policy with due caution. There are certain shortcomings of
our results: (1) Our model consists of only two equations and does not
show the relationship between the market for meat and other markets
and the total economy (section 3.7). (2) Our model is static and
represents only very imperfectly the relationships which presumably exist in
a dynamic economy like the economy of the United States (section 3.8).
(3) Our data have been assumed to be perfect and not to be subject to
errors of observations, etc., which give rise to errors in the variables
(section 6.5). But we know that there are rather large errors of
observations in the series analyzed, especially in the consumption series.
Moreover the indices used represent only poorly the quantities and prices
which refer to specific types of meat. Hence our assumptions of the
absence of errors in the variables are in all probability not justified.
(4) There is also no consideration of the possibility of autocorrelation of
successive observations. Problems of this nature will be treated in Part 3.

(b) Supply of Meat. We consider now equation (8). This is the
supply function of meat. We have $G = 2$ endogenous variables in our
system, $y_1$ and $y_2$. The necessary condition of identification is
fulfilled since there is $G - 1 = 1$ variable which enters into the system
but not into (8). This variable is $z_1$. The necessary and sufficient
condition for identification of our equation is that $c_{11} \neq 0$.

It is again easy to see that our equation (8) is just identified. The
number of endogenous variables in our equation is $H = 2$, namely $y_1$
and $y_2$. The number of exogenous variables which enter into the
system but not into the equation is $D = 1$. This exogenous variable is
$z_1$. We have $D = H - 1$, and equation (8) is just identified.

We change our notation in conformity with the notation of this section
and in analogy with the analysis of the demand function: we put
$y_1 = x_1$, $y_2 = x_2$, $z_2 = u_1$, $z_1 = v_1$. But formally, with this change
in notation, we have again to estimate a structural equation of the form (9).

The computation of the reduced form equation is similar to the results
in (a), but we have to exchange the role of the $A_{ij}$ and $B_{ij}$. This follows
from the fact that $z_1$ and $z_2$ have changed places.

Hence the reduced form equations (2) are now:

(20)  $x_1 = -0.2923865 u_1 + 0.0566668 v_1$

(21)  $x_2 = 0.2054593 u_1 + 0.0912396 v_1$

The structural coefficients $b_i$ connected with the endogenous variables
are estimated from (5):

(22)  $0.0566668 b_1 + 0.0912396 b_2 = 0$

We normalize again by putting $b_1 = -1$. This gives $b_2 = 0.6210768$.
The remaining structural coefficients $c_i$ are estimated with the help of
(6):

(23)  $c_1 = -(-0.2923865)(-1) - (0.2054593)(0.6210768) = -0.4199919$

Our estimate of the supply equation for meat is finally:

(24)  $Y_1 = 0.6210768 Y_2 - 0.4199919 Z_2 + 145.9780$

We can again compute the elasticities of supply. The price elasticity
of supply is 0.345, and the cost elasticity is -0.223. These elasticities
are evaluated at the means of all the variables given in Table 1.
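
The supply computation differs from the demand computation only in the exchange of the roles of the two exogenous variables. A small helper makes this explicit; the function `ils_two_endogenous` and its argument layout are assumptions of this sketch, which covers only the case $H = 2$, $F = 1$, $D = 1$ with the normalization $b_1 = -1$.

```python
def ils_two_endogenous(A, B):
    """Return (b2, c1) for b1*x1 + b2*x2 + c1*u1 = e with b1 = -1.

    A = (A11, A21): reduced form coefficients of the included variable u1.
    B = (B11, B21): reduced form coefficients of the excluded variable v1.
    """
    b1 = -1.0
    b2 = -B[0] * b1 / B[1]           # relation (5)
    c1 = -(A[0] * b1 + A[1] * b2)    # relation (6)
    return b2, c1


# Supply equation: u1 = z2 is included, v1 = z1 is excluded, see (20), (21).
b2, c1 = ils_two_endogenous(A=(-0.2923865, 0.2054593),
                            B=(0.0566668, 0.0912396))

# Means from Table 1.
mean_Y1, mean_Y2, mean_Z2 = 166.1913, 92.3391, 88.4217
constant = mean_Y1 - b2 * mean_Y2 - c1 * mean_Z2
price_elasticity = b2 * mean_Y2 / mean_Y1
cost_elasticity = c1 * mean_Z2 / mean_Y1

print(b2, c1, constant, price_elasticity, cost_elasticity)
```

Calling the same helper with `A=(0.0566668, 0.0912396)` and `B=(-0.2923865, 0.2054593)` gives the demand coefficients of part (a).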

The interpretation of these elasticities of the supply of meat is as
follows: Let us assume that conditions are on the whole similar to the
ones prevailing during the interwar period analyzed. There is no change
in production methods or cost conditions. If, ceteris paribus, the price
of meat increases by 1 per cent, we may expect the supply of meat to
increase by about 1/3 of 1 per cent. If, under similar conditions, other
things being equal, the cost of processing meat increases by 1 per cent,
the supply of meat will decrease by about 1/5 of 1 per cent.

The importance of these results is potentially great for applications in
economic policy. They should be taken into account if the government
decides, for instance, to levy a tax on the sale of meat, to buy meat (e.g.,
for relief purposes), or to change the cost of processing meat (e.g., through
measures of wage policy).

There are, however, again severe limitations on the practical
applicability of our results. These are essentially the same as the difficulties
discussed above in connection with the demand function for meat.

7.3 Estimation of a Single Over-identified Equation 

The methods given in section 7.2 are not applicable in the case when
the structural equation which is to be estimated is over-identified. Then
we have the case in which the rank of the matrix $[B_{ij}]$ is greater than
$H - 1$ or when $D > H - 1$.

The number of endogenous variables which appear in the equation in
question ($H$) is equal to or smaller than the number of exogenous variables
which are in the system but do not appear in the equation ($D$).

Methods have been given 1 for the estimation of the complete system 
of structural equations in this case. They are, however, very complicated 
and computationally difficult. 

If we want only to estimate the coefficients in one (or a few) structural 
equations of the complete system, we may utilize methods which are 
not completely efficient. 2 They do not take all the identifying conditions 
into account, and they deal only with one equation at a time. But they 
lead to consistent estimates (section 5.2). 

We use the method of maximum likelihood (section 5.2). The
estimates given by the method of maximum likelihood will tend in probability
to the true population values of the structural coefficients as the number
of observations which constitute the sample becomes larger and larger.
We have of course to assume that the errors in the equations are normally
distributed, are independent, and have no autocorrelation and serial
correlation.

Let the equation to be estimated be of the form of equation (1) in section
7.2. The method of maximum likelihood leads to the following
procedures:


1 F.. J. Marschak, L. Hurwicz, T. Koopmans, R. B. Leipnik: “Estimating 
relations from non-experimental observations,” abstracts of papers,
Econometrica, vol. 14 (1946), pp. 165 ff. J. Marschak: “Statistical inference in
economics,” in T. C. Koopmans, ed.: Statistical Inference in Dynamic Economic
Models (New York, 1950), pp. 1 ff. T. C. Koopmans, H. Rubin, and R. B.
Leipnik: “Measuring the equation system of dynamic economics,” ibid., pp.
153 ff.

2 T. W. Anderson and H. Rubin: “Estimation of the parameters of a single
equation in a complete system of stochastic equations,” Annals of Mathematical
Statistics, vol. 20 (1949), pp. 46 ff. L. R. Klein: “The use of economic models
as a guide to economic policy,” Econometrica, vol. 15 (1947), pp. 111 ff. M. A.
Girshick and T. Haavelmo: “Statistical analysis of the demand for food:
Examples of simultaneous estimation of structural equations,” ibid., vol. 15
(1947), pp. 79 ff. T. W. Anderson and H. Rubin: “The asymptotic properties
of the parameters of a single equation in a complete system of stochastic
equations,” Annals of Mathematical Statistics, vol. 21 (1950), pp. 570 ff. T. W.
Anderson: “Estimation of parameters of a single equation by the limited
information method,” in T. C. Koopmans, ed.: Statistical Inference in Dynamic
Economic Models (New York, 1950), pp. 311 ff.





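We form the matrices:

(1)  $E_{ij} = \frac{1}{N} \sum_{t=1}^{N} x_{it} x_{jt}$   $(i, j = 1, 2, \cdots H)$

(2)  $F_{ij} = \frac{1}{N} \sum_{t=1}^{N} x_{it} u_{jt}$   $(i = 1, 2, \cdots H;\ j = 1, 2, \cdots F)$

(3)  $G_{ij} = \frac{1}{N} \sum_{t=1}^{N} x_{it} v_{jt}$   $(i = 1, 2, \cdots H;\ j = 1, 2, \cdots D)$

(4)  $H_{ij} = \frac{1}{N} \sum_{t=1}^{N} u_{it} u_{jt}$   $(i, j = 1, 2, \cdots F)$

(5)  $J_{ij} = \frac{1}{N} \sum_{t=1}^{N} u_{it} v_{jt}$   $(i = 1, 2, \cdots F;\ j = 1, 2, \cdots D)$

(6)  $K_{ij} = \frac{1}{N} \sum_{t=1}^{N} v_{it} v_{jt}$   $(i, j = 1, 2, \cdots D)$

Instead of the variables $v_1 \cdots v_D$ we use a set of variables $s_1 \cdots s_D$
which is orthogonal to the $u_i$:

(7)  $s_{it} = v_{it} - \sum_{r=1}^{F} \sum_{s=1}^{F} J_{ri} H^{rs} u_{st}$   $(i = 1, 2, \cdots D)$

$H^{rs}$ is now the element of the matrix which is inverse to the matrix $[H_{rs}]$
(4). (See Appendix A.1.1.)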


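We form the new matrices:

(8)  $L_{ij} = G_{ij} - \sum_{r=1}^{F} \sum_{s=1}^{F} F_{ir} H^{rs} J_{sj}$   $(i = 1, 2, \cdots H;\ j = 1, 2, \cdots D)$

(9)  $M_{ij} = K_{ij} - \sum_{r=1}^{F} \sum_{s=1}^{F} J_{ri} H^{rs} J_{sj}$   $(i, j = 1, 2, \cdots D)$

The reduced form becomes now:

(10)  $x_{it} = \sum_{j=1}^{F} N_{ij} u_{jt} + \sum_{j=1}^{D} B_{ij} s_{jt} + \eta_{it}$   $(i = 1, 2, \cdots H)$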


The matrix $[B_{ij}]$ is estimated by the set of equations:

(11)  $\sum_{j=1}^{D} B_{ij} M_{jk} = L_{ik}$   $(i = 1, 2, \cdots H;\ k = 1, 2, \cdots D)$

The quantities $N_{ij}$ are given by the system of equations:

(12)  $\sum_{j=1}^{F} N_{ij} H_{jk} = F_{ik}$   $(i = 1, 2, \cdots H;\ k = 1, 2, \cdots F)$

We introduce a new set of values $P_{ij}$ by the formula:

(13)  $P_{ij} = \sum_{r=1}^{D} \sum_{s=1}^{D} B_{ir} M_{rs} B_{js} = \sum_{r=1}^{D} B_{ir} L_{jr}$   $(i, j = 1, 2, \cdots H)$

The matrix of the residuals is given by:

(14)  $Q_{ij} = E_{ij} - \sum_{r=1}^{D} \sum_{s=1}^{D} B_{ir} M_{rs} B_{js} - \sum_{r=1}^{F} \sum_{s=1}^{F} N_{ir} H_{rs} N_{js} = E_{ij} - P_{ij} - \sum_{r=1}^{F} N_{ir} F_{jr}$   $(i, j = 1, 2, \cdots H)$


The estimates of the structural coefficients are given by the system
of linear equations:

(15)  $\sum_{j=1}^{H} (P_{ij} - \nu Q_{ij}) b_j = 0$   $(i = 1, 2, \cdots H)$

This linear homogeneous system can have a non-trivial solution only
if its determinant is equal to zero: 3

(16)  $| P_{ij} - \nu Q_{ij} | = 0$   $(i, j = 1, 2, \cdots H)$

We have to choose the smallest root $\nu$ of this equation for the maximum
likelihood solution. Then, substituting the smallest root into (15), we
obtain estimates for $b_1 \cdots b_H$ in equation (1) of section 7.2 after
normalization has been introduced.
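
For $H = 2$ the determinantal equation (16) is a quadratic in $\nu$, so the smallest root can be written down directly. The following Python sketch illustrates this special case only; the matrices `P` and `Q` hold hypothetical values chosen so that the roots are easy to verify by hand, and the function name `smallest_root` is an assumption of this sketch.

```python
import math


def smallest_root(P, Q):
    """Smallest root nu of det(P - nu*Q) = 0 for 2 x 2 matrices P and Q.

    Expanding the determinant gives a*nu**2 + b*nu + c = 0.
    """
    a = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    b = -(P[0][0] * Q[1][1] + P[1][1] * Q[0][0]
          - P[0][1] * Q[1][0] - P[1][0] * Q[0][1])
    c = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    root = math.sqrt(b * b - 4.0 * a * c)
    return min((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))


# Hypothetical matrices, for illustration only (roots are 1 and 3).
P = [[2.0, 1.0], [1.0, 2.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]

nu = smallest_root(P, Q)

# Substitute nu into the first equation of (15) with b1 = -1.
b1 = -1.0
b2 = -(P[0][0] - nu * Q[0][0]) * b1 / (P[0][1] - nu * Q[0][1])

# The second equation of (15) should then hold as well.
residual = (P[1][0] - nu * Q[1][0]) * b1 + (P[1][1] - nu * Q[1][1]) * b2
```

For larger $H$ the same problem is a generalized eigenvalue problem, for which the computational methods of Appendix A.2 are intended.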

The estimates of the structural constants $c_i$ are given by:

(17)  $c_i = - \sum_{j=1}^{H} b_j N_{ji}$   $(i = 1, 2, \cdots F)$

The similarity between these estimation methods and the procedures
of many forms of multivariate analysis which have been presented in
Chapter 6 is apparent.

The reader is referred to Appendix A.2 for computational methods
for finding estimates of the structural coefficients of a
single over-identified equation.

A test for the restrictions assumed for identification is also possible.
We test the hypothesis that the rank of the matrix $[B_{ij}]$ is $H$ against the
hypothesis that it is $H - 1$. The test criterion is:

(18)  $N \log_e (1 + \nu)$

The expression (18) has for large samples the $\chi^2$ distribution with
$D - H + 1$ degrees of freedom.
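
A minimal Python sketch of the criterion (18) follows. The value of $\nu$ is hypothetical; $N = 23$, $D = 2$ and $H = 2$ are the values of the meat example of this section. The constant 3.841 is the usual tabulated 5 per cent point of the $\chi^2$ distribution with 1 degree of freedom. The reading that large values of the criterion speak against the identifying restrictions is the customary one and is stated here as an assumption.

```python
import math


def restriction_test_statistic(n_obs, nu):
    """Criterion (18): N * log_e(1 + nu)."""
    return n_obs * math.log(1.0 + nu)


def restriction_test_df(d, h):
    """Degrees of freedom of the large-sample chi-square distribution."""
    return d - h + 1


nu_hypothetical = 0.1  # illustrative value, not an estimate from the text
statistic = restriction_test_statistic(23, nu_hypothetical)
df = restriction_test_df(d=2, h=2)

CHI2_5PCT_1DF = 3.841  # tabulated 5 per cent point for 1 degree of freedom
exceeds_critical_value = statistic > CHI2_5PCT_1DF
```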

Confidence intervals for the estimated structural coefficients can be
computed as follows:

The expression:

(19)  $\frac{\sum_{i=1}^{H} \sum_{j=1}^{H} b_i b_j \left[ \sum_{r=1}^{D} \sum_{s=1}^{D} L_{ir} M^{rs} L_{js} \right] (N - K)}{D \sum_{i=1}^{H} \sum_{j=1}^{H} b_i b_j Q_{ij}}$

follows the variance ratio ($F$) distribution with $D$ and $N - K$ degrees of
freedom. This expression (19) may be used to find confidence intervals
for the coefficients $b_i$.

The expression: 



( 20 ) 


H D D H F F B 

2 b.b, 2 2 FirH r ‘F u + 2 2VA-1 + 2 lc i b l F li + 

j = 1 r = 1 8 = 1 1 = 1 j = l i = 1 j = 1 

F F H H D D "] 

2 lc iCi H it + 2 2 Kb f 2 A (N-K) 

i-1 j-l 1 = 1 j = 1 r = l # = 1 J __ 


B B 

K 2 2 bfiiQ, 

1=1 i=i 




follows the variance ratio ($F$) distribution with $K$ and $N - K$ degrees of
freedom. This formula may be used to compute confidence limits for
the coefficients $b_i$ and $c_i$.

Example 1. We give an example similar to the one utilized to illustrate
the estimation of just identified equations (section 7.2, Example 1). Let
us denote by $Y_1$ the quantity of meat consumed, by $Y_2$ the price of meat,
by $Z_1$ the per capita disposable income, by $Z_2$ the cost of processing meat,
and by $Z_3$ the cost of producing agricultural products. The structural
equations to be estimated are as follows:

(21)  $b_{11} y_1 + b_{12} y_2 + c_{11} z_1 = \epsilon_1$

(22)  $b_{21} y_1 + b_{22} y_2 + c_{22} z_2 + c_{23} z_3 = \epsilon_2$

Small letters designate again the deviations of the variables from their
means. Equation (21) may be considered the demand function for meat,
and equation (22) the supply function of meat. We have $G = 2$
endogenous variables, $y_1$ and $y_2$, and $K = 3$ exogenous variables, $z_1$, $z_2$,
and $z_3$, in our system.

We will concentrate on the estimation of the demand equation (21).
The necessary condition of identification is fulfilled, since there are more
than $G - 1 = 1$ variables which enter into the system but not into equation
(21). In fact, we have two such variables in equation (22), namely $z_2$
and $z_3$. The necessary and sufficient condition for identification will be
fulfilled if either $c_{22} \neq 0$ or $c_{23} \neq 0$.

But equation (21) is over-identified. This may be seen in the following
way: There are $H = 2$ endogenous variables, $y_1$ and $y_2$, which enter into
equation (21) which is to be estimated. But there are now $D = 2$
exogenous variables which are in the system but do not appear in equation
(21). These variables are $z_2$ and $z_3$. But for identification we need only
$H - 1 = 1$ exogenous variable which is in the system but not in the
equation. Hence the equation in question is over-identified.

The means of the variables are given in Table 1.

TABLE 1
Arithmetic Means

Y_1   Quantity of meat consumed                  166.1913
Y_2   Price of meat                               92.3391
Z_1   Per capita disposable income               495.5652
Z_2   Cost of processing meat                     88.4217
Z_3   Cost of producing agricultural products    102.2174


All data are annual. We have $N = 23$ observations, from the years
1919 to 1941. The data are described in detail as follows:³

$Y_1$: Per capita consumption of meat, measured in pounds.

This is a sum of the per capita consumption of meat, poultry, and fish.
Complete data were not available on fish consumption before 1930, so
to make the series complete an average consumption figure of fish from
1930 to 1945 was used for the period 1919-41. Consumption of fish
varied only from 8.2 pounds to 12.0 pounds during the period mentioned.

Meat: p. 38, 1949 Outlook, Bureau of Agricultural Economics, U.S.
Department of Agriculture.

Poultry: Agricultural Statistics, 1944, p. 397, Table 512, U.S.
Department of Agriculture, 1919-41.

Fish: Commercial Fisheries Review, vol. 10, No. 11 (November, 1948),
Washington, D.C., U.S. Department of Interior, Fish and Wildlife
Service.

3 B. L. French: Applications of Simultaneous Equations to the Analysis of
the Demand for Meat (unpublished thesis, Ames, Iowa, 1950).


$Y_2$: Retail prices of meat, 1935-39 = 100.

Bureau of Labor Statistics Index Price Series, including poultry and
fish, was used and deflated by the Index of Consumer Prices for Moderate
Income Families in Large Cities.

“Handbook of labor statistics,” Bulletin 916, U.S. Department of Labor,
Bureau of Labor Statistics, 1947 edition, Washington, D.C., Table 3-4,
p. 121.

$Z_1$: Per capita disposable real income, dollars.

This series used Personal Disposable Income, and this was placed on
a per capita basis by dividing by the population estimates of the United
States. The figures were then further corrected by division by the Index
of Consumer Prices for Moderate Income Families in Large Cities.

From U.S. Department of Commerce data and Bureau of Agricultural
Economics estimates based on such data; Agricultural Outlook Charts,
1948, B.A.E., U.S. Department of Agriculture, p. 10; Marketing and
Transportation Situation, B.A.E., U.S. Department of Agriculture, August,
1948, p. 20.

$Z_2$: Cost of processing meat, 1935-39 = 100.

This series is the Unit Labor Cost in the Slaughtering and Meat Packing
Industry, corrected by dividing by the Index of Consumer Prices for
Moderate Income Families in Large Cities.

“Handbook of labor statistics,” Bulletin 916, U.S. Department of Labor,
Bureau of Labor Statistics, 1947 edition, Washington, D.C., Table F-2,
p. 158, for 1919-39.

Data for 1940-41 from U.S. Bureau of Labor Statistics, Productivity
and Unit Labor Cost in Selected Manufacturing Industries, 1939-44,
p. 6, May, 1945 (processed).

$Z_3$: Cost of agricultural production, 1935-39 = 100.

Production Expenses of Farm Operators, United States, 1910-47, was
used as the basic data, using total production expenses. This was indexed
and then deflated by the Index of Consumer Prices for Moderate Income
Families in Large Cities.

Source: Estimates obtained from the Bureau of Agricultural Economics.
Unpublished.

The sums of squares and products of our variables are needed for the
subsequent computations. They are given in Table 2.

TABLE 2
Sums of Squares and Products

            Y_1            Y_2            Z_1            Z_2            Z_3
Y_1    1,369.53826     -352.55217    3,671.91304     -536.47565      983.86348
Y_2                   1,581.49478    8,354.59130      850.33044    1,235.76435
Z_1                                 83,433.65217    3,611.71739   12,204.77391
Z_2                                                 2,534.79913      730.78130
Z_3                                                                2,626.99304


We use the notation of this section. Hence we have $x_1 = y_1$, $x_2 = y_2$,
$u_1 = z_1$, $v_1 = z_2$, $v_2 = z_3$. Equation (21) which is to be estimated becomes
in this notation:

(23) $b_1 x_1 + b_2 x_2 + c_1 u_1 = \varepsilon$

We compute the elements of matrix $E$ from formula (1):

TABLE 3
Matrix E

           x_1        x_2
x_1   [ 59.545    -15.328]
x_2   [            68.761]

This matrix is symmetrical. Hence we present for this and all other
symmetrical matrices only the elements on and above the principal diagonal.
The elements are obtained by dividing the corresponding elements in
Table 2, i.e., the sums of the products of $x_i$ and $x_j$, by $N = 23$.
The elements in matrix $F$ are formed in a similar way:

TABLE 4
Matrix F

           u_1
x_1   [159.648]
x_2   [363.243]

The corresponding elements in Table 2, i.e., the sums of the products
of $x_i$ and $u_j$, are divided by 23. We also compute matrix $G$:

TABLE 5
Matrix G

           v_1        v_2
x_1   [-23.325     42.777]
x_2   [ 36.971     53.729]




By dividing the sums of the products of $u_i$ and $u_j$ in Table 2 by $N = 23$
we form matrix $H$:

TABLE 6
Matrix H

           u_1
u_1   [3627.550]

We divide the sums of the products of $u_i$ and $v_j$ in Table 2 by 23 and form
matrix $J$:

TABLE 7
Matrix J

           v_1        v_2
u_1   [157.031    530.642]

Finally, by dividing the elements in Table 2 which are the sums of the
cross products of $v_i$ and $v_j$ by 23, we have matrix $K$:

TABLE 8
Matrix K

           v_1        v_2
v_1   [110.209     31.773]
v_2   [           114.217]
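The division of the sums in Table 2 by N = 23 can be checked mechanically. The following is only an illustrative sketch: the dictionary layout and the helper names `moment` and `block` are assumptions introduced here, while the figures are those printed in Table 2.

```python
# Illustrative sketch; container layout and helper names are assumptions.
N_OBS = 23  # number of annual observations, 1919-41

# Upper triangle of Table 2 (sums of squares and products of deviations).
SUMS = {
    ("Y1", "Y1"): 1369.53826, ("Y1", "Y2"): -352.55217,
    ("Y1", "Z1"): 3671.91304, ("Y1", "Z2"): -536.47565,
    ("Y1", "Z3"): 983.86348,
    ("Y2", "Y2"): 1581.49478, ("Y2", "Z1"): 8354.59130,
    ("Y2", "Z2"): 850.33044, ("Y2", "Z3"): 1235.76435,
    ("Z1", "Z1"): 83433.65217, ("Z1", "Z2"): 3611.71739,
    ("Z1", "Z3"): 12204.77391,
    ("Z2", "Z2"): 2534.79913, ("Z2", "Z3"): 730.78130,
    ("Z3", "Z3"): 2626.99304,
}


def moment(a, b):
    """Sum of products of variables a and b divided by N_OBS."""
    total = SUMS.get((a, b), SUMS.get((b, a)))
    return total / N_OBS


def block(rows, cols):
    """Matrix of moments, rounded to three decimals as in Tables 3 to 8."""
    return [[round(moment(r, c), 3) for c in cols] for r in rows]


x, u, v = ["Y1", "Y2"], ["Z1"], ["Z2", "Z3"]  # x = y, u = z_1, v = (z_2, z_3)
E, F, G = block(x, x), block(x, u), block(x, v)
H, J, K = block(u, u), block(u, v), block(v, v)
```

With these inputs the six blocks agree with Tables 3 to 8 to the three decimals shown there.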


We use formula (8) to compute the elements of matrix $L$. We note
that the inverse of the single element of matrix $H$ (Table 6) is $H^{11} =
1/3627.550 = 0.000275668$. Hence we have:

(24) $L_{11} = G_{11} - F_{11} H^{11} J_{11} = -23.325 - (159.648)(0.000275668)(157.031) = -30.236$

(25) $L_{12} = G_{12} - F_{11} H^{11} J_{12} = 42.777 - (159.648)(0.000275668)(530.642) = 19.423$

(26) $L_{21} = G_{21} - F_{21} H^{11} J_{11} = 36.971 - (363.243)(0.000275668)(157.031) = 21.247$

(27) $L_{22} = G_{22} - F_{21} H^{11} J_{12} = 53.729 - (363.243)(0.000275668)(530.642) = 0.594$

These results are collected in Table 9.

TABLE 9
Matrix L

           v_1        v_2
x_1   [-30.236     19.423]
x_2   [ 21.247      0.594]

Similarly we compute the elements of matrix $M$ from formula (9):

(28) $M_{11} = K_{11} - J_{11} H^{11} J_{11} = 110.209 - (157.031)(0.000275668)(157.031) = 103.411$

(29) $M_{12} = M_{21} = K_{12} - J_{11} H^{11} J_{12} = 31.773 - (157.031)(0.000275668)(530.642) = 8.803$

(30) $M_{22} = K_{22} - J_{12} H^{11} J_{12} = 114.217 - (530.642)(0.000275668)(530.642) = 36.594$

These results are assembled in Table 10.

TABLE 10
Matrix M

           v_1        v_2
v_1   [103.411      8.803]
v_2   [            36.594]
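The arithmetic of (24) to (30) can be retraced in a few lines. This is a sketch under the assumption, taken from the example, that there is a single variable $u_1$, so that $H$ reduces to one element; the variable names are illustrative only, and the last decimal may differ slightly from the printed values because of rounding.

```python
# Sketch of L = G - F H^{-1} J and M = K - J' H^{-1} J for a single u variable.
# Inputs are the rounded values of Tables 4 to 8.
F = [159.648, 363.243]                      # moments of x_i with u_1
G = [[-23.325, 42.777], [36.971, 53.729]]   # moments of x_i with v_j
H11 = 3627.550                              # moment of u_1 with itself
J = [157.031, 530.642]                      # moments of u_1 with v_j
K = [[110.209, 31.773], [31.773, 114.217]]  # moments of v_i with v_j

H_INV = 1.0 / H11  # inverse of the one-element matrix H

# Remove from G and K the part that is linearly explained by u_1.
L = [[G[i][j] - F[i] * H_INV * J[j] for j in range(2)] for i in range(2)]
M = [[K[i][j] - J[i] * H_INV * J[j] for j in range(2)] for i in range(2)]
```

With several $u$ variables, `H_INV` would have to be replaced by a full matrix inverse.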


To compute the elements of matrix $B$ we have the following system of
equations:

(31) $M_{11} B_{11} + M_{21} B_{12} = L_{11}$

(32) $M_{12} B_{11} + M_{22} B_{12} = L_{12}$

This system is in our case constructed from the data in Tables 9 and 10:

(33) $103.411 B_{11} + 8.802 B_{12} = -30.236$

(34) $8.802 B_{11} + 36.594 B_{12} = 19.423$

The solutions of this system of equations are $B_{11} = -0.344626$, and
$B_{12} = 0.613673$.

We have a similar system of equations for the regression coefficients
$B_{21}$ and $B_{22}$:

(35) $M_{11} B_{21} + M_{21} B_{22} = L_{21}$

(36) $M_{12} B_{21} + M_{22} B_{22} = L_{22}$

This is in our case, again from the data of Tables 9 and 10:

(37) $103.411 B_{21} + 8.802 B_{22} = 21.247$

(38) $8.802 B_{21} + 36.594 B_{22} = 0.593$

The solutions are now $B_{21} = 0.208346$, and $B_{22} = -0.033887$.

These results are assembled in Table 11.

TABLE 11
Matrix B

            v_1          v_2
x_1   [-0.344626     0.613673]
x_2   [ 0.208346    -0.033887]
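The two systems (33), (34) and (37), (38) share the same coefficient matrix and differ only in their right-hand sides. A minimal sketch of their solution by Cramer's rule follows; the function name `solve_2x2` is an assumption made for illustration, and the inputs are the rounded coefficients as printed, so the results agree with Table 11 only to about four decimals.

```python
# Sketch: solve the 2x2 systems (33)-(34) and (37)-(38) by Cramer's rule.
M = [[103.411, 8.802], [8.802, 36.594]]          # coefficients as printed
RHS_ROWS = [[-30.236, 19.423], [21.247, 0.593]]  # right-hand sides as printed


def solve_2x2(a, rhs):
    """Solve a[0][0]*x0 + a[0][1]*x1 = rhs[0], a[1][0]*x0 + a[1][1]*x1 = rhs[1]."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    x1 = (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det
    return [x0, x1]


# Row i of B holds the regression coefficients B_i1 and B_i2.
B = [solve_2x2(M, rhs) for rhs in RHS_ROWS]
```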

The elements of matrix $N_{ij}$ are computed from formula (12):

(39) $H_{11} N_{11} = F_{11}$

(40) $3627.550\,N_{11} = 159.648$

Hence $N_{11} = 0.044010$.

(41) $H_{11} N_{21} = F_{21}$

(42) $3627.550\,N_{21} = 363.243$

We obtain $N_{21} = 0.100135$.

The elements of matrix $N$ are assembled in Table 12.

TABLE 12
Matrix N

           u_1
x_1   [0.044010]
x_2   [0.100135]

From equations (13) we compute the quantities $P_{ij}$:

(43) $P_{11} = B_{11} L_{11} + B_{12} L_{12} = (-0.344626)(-30.236) + (0.613673)(19.423) = 22.339$

(44) $P_{12} = P_{21} = B_{11} L_{21} + B_{12} L_{22} = (-0.344626)(21.247) + (0.613673)(0.594) = -6.958$

(45) $P_{22} = B_{21} L_{21} + B_{22} L_{22} = (0.208346)(21.247) + (-0.033887)(0.594) = 4.406$

They are assembled in Table 13.

TABLE 13
Matrix P

           x_1        x_2
x_1   [ 22.339     -6.958]
x_2   [             4.406]

Finally, the values $Q_{ij}$ are computed from formula (14):

(46) $Q_{11} = E_{11} - P_{11} - N_{11} F_{11} = 59.545 - 22.339 - (0.044010)(159.648) = 30.180$

(47) $Q_{12} = Q_{21} = E_{12} - P_{12} - N_{11} F_{21} = -15.328 - (-6.958) - (0.044010)(363.243) = -24.356$

(48) $Q_{22} = E_{22} - P_{22} - N_{21} F_{21} = 68.761 - 4.406 - (0.100135)(363.243) = 27.982$

These results form matrix $Q$ and are assembled in Table 14.

TABLE 14
Matrix Q

           x_1        x_2
x_1   [ 30.180    -24.356]
x_2   [            27.982]

This is the matrix of the residuals.
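The quantities in (43) to (48) follow a simple pattern: $P$ collects the products of the rows of $B$ and $L$, and $Q$ is what remains of $E$ after $P$ and the contribution of $u_1$ are removed. The sketch below retraces this with the rounded values of the tables; the variable names are assumptions, and agreement with Tables 13 and 14 holds to roughly three decimals.

```python
# Sketch of P_ij = sum_r B_ir L_jr and Q_ij = E_ij - P_ij - N_i1 F_j1.
B = [[-0.344626, 0.613673], [0.208346, -0.033887]]  # Table 11
L = [[-30.236, 19.423], [21.247, 0.594]]            # Table 9
E = [[59.545, -15.328], [-15.328, 68.761]]          # Table 3, full symmetric form
N_COL = [0.044010, 0.100135]                        # Table 12
F_COL = [159.648, 363.243]                          # Table 4

P = [[sum(B[i][r] * L[j][r] for r in range(2)) for j in range(2)]
     for i in range(2)]
Q = [[E[i][j] - P[i][j] - N_COL[i] * F_COL[j] for j in range(2)]
     for i in range(2)]
```

Because of rounding in the inputs, `P[0][1]` and `P[1][0]` agree only approximately, which is a convenient check on the earlier steps.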

The system of equations (15) is formed in order to determine the
structural coefficients $b_i$:

(49) $(22.339 - 30.180\nu) b_1 + (-6.958 + 24.356\nu) b_2 = 0$

(50) $(-6.958 + 24.356\nu) b_1 + (4.406 - 27.982\nu) b_2 = 0$

The system of linear equations (49) and (50) is homogeneous. Hence
it can have a non-trivial solution only if its determinant (16) is zero.
This gives the determinantal equation:

(51) $\begin{vmatrix} (22.339 - 30.180\nu) & (-6.958 + 24.356\nu) \\ (-6.958 + 24.356\nu) & (4.406 - 27.982\nu) \end{vmatrix} = 0$

The smallest root of equation (51) is $\nu = 0.129339$. Inserting this
root into equation (49), we have for the determination of the structural
coefficients $b_i$:

(52) $18.436 b_1 - 3.808 b_2 = 0$

We normalize by making $b_1 = -1$. Then we have $b_2 = -4.841$.
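For a system of two equations the determinantal equation (51) is an ordinary quadratic in the root, so the smallest root and the normalized coefficient can be verified directly. The sketch below is an illustration only: the names `nu`, `b1`, `b2` and the expansion into quadratic coefficients `a`, `b`, `c` are assumptions of this example, and the inputs are the rounded entries of Tables 13 and 14, so the root agrees with the printed value only to about four decimals.

```python
import math

# Sketch: smallest root of det(P - nu*Q) = 0 and the normalized b_2.
P = [[22.339, -6.958], [-6.958, 4.406]]      # Table 13
Q = [[30.180, -24.356], [-24.356, 27.982]]   # Table 14

# Expanding the 2x2 determinant gives a*nu^2 + b*nu + c = 0.
a = Q[0][0] * Q[1][1] - Q[0][1] ** 2
b = -(P[0][0] * Q[1][1] + P[1][1] * Q[0][0] - 2.0 * P[0][1] * Q[0][1])
c = P[0][0] * P[1][1] - P[0][1] ** 2

disc = math.sqrt(b * b - 4.0 * a * c)
nu = min((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a))

# First equation of the homogeneous system with the normalization b_1 = -1.
b1 = -1.0
b2 = -(P[0][0] - nu * Q[0][0]) * b1 / (P[0][1] - nu * Q[0][1])
```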

From equation (17) we have for the determination of the structural
coefficients $c_i$ the following relationship:

(53) $c_1 = -b_1 N_{11} - b_2 N_{21} = -(-1)(0.044010) - (-4.841)(0.100135) = 0.441$

Finally we assemble our results into the estimated structural equation
of the demand for meat, taking the arithmetic means given in Table 1
into account:

(54) $Y_1 = -4.841\,Y_2 + 0.441\,Z_1 + 394.014$

We can again compute the elasticities of the demand for meat by using
the arithmetic means: The price elasticity is -2.690, and the income
elasticity is 1.315. This is to be interpreted in this way:

Assume that conditions are on the whole very similar to those of the
interwar period. Ceteris paribus, if the price of meat increases by 1 per
cent, we may expect the demand for meat to decrease by almost
3 per cent. Under similar assumptions, if per capita income increases by
1 per cent, the demand for meat is likely to increase by more than 1 per
cent. These results are tentative and subject to qualifications similar to
the ones discussed above (Example 1 of section 7.2).
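The elasticities quoted above are elasticities at the means: each coefficient of equation (54) is multiplied by the ratio of the mean of the explanatory variable to the mean of the quantity consumed. A short sketch follows, in which the function name `elasticity_at_means` is an assumption and the coefficients are taken as printed in (54).

```python
# Sketch: elasticities of demand evaluated at the arithmetic means of Table 1.
MEAN_QUANTITY = 166.1913   # Y_1
MEAN_PRICE = 92.3391       # Y_2
MEAN_INCOME = 495.5652     # Z_1

PRICE_COEF = -4.841        # coefficient of Y_2 as printed in (54)
INCOME_COEF = 0.441        # coefficient of Z_1 as printed in (54)


def elasticity_at_means(coef, mean_x, mean_y):
    """Elasticity of y with respect to x for a linear relation, at the means."""
    return coef * mean_x / mean_y


price_elasticity = elasticity_at_means(PRICE_COEF, MEAN_PRICE, MEAN_QUANTITY)
income_elasticity = elasticity_at_means(INCOME_COEF, MEAN_INCOME, MEAN_QUANTITY)
```

Because the relation is linear, these elasticities hold only at the point of means and change along the demand curve.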

In order to test for the restrictions assumed for the identification of
equation (21) in our system we compute from formula (18):

(55) $N \log_e (1 + \nu) = (23) \log_e (1 + 0.129339) = 2.797$

The quantity (55) is asymptotically distributed like $\chi^2$ with $D - H + 1
= 2 - 2 + 1 = 1$ degree of freedom. But at the 5 per cent level we
have a permissible $\chi^2$ of 3.841, on the 1 per cent level a $\chi^2$ of 6.635. Hence
our empirical value (55) is not significant. It is likely that the conditions
of over-identification assumed are fulfilled in the population which
corresponds to our sample.
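The test computation is short enough to retrace directly. In this sketch the variable names are assumptions; the root and the two critical values for 1 degree of freedom are those quoted in the text.

```python
import math

# Sketch: large-sample test of the over-identifying restrictions.
N_OBS = 23
NU = 0.129339                    # smallest root of the determinantal equation
CHI2_5PCT = 3.841                # critical values for 1 degree of freedom,
CHI2_1PCT = 6.635                # as quoted in the text

statistic = N_OBS * math.log(1.0 + NU)
reject_at_5pct = statistic > CHI2_5PCT
reject_at_1pct = statistic > CHI2_1PCT
```

Since the criterion is valid only for large samples, a result based on 23 observations should be read with some reserve.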



Part 3 



Some Topics in Time Series 
Analysis 


In Part 2 we have presented certain methods which appear promising
for econometric investigations. We have neglected, however, one particular
feature which makes the application of traditional statistical
methods to economic data most difficult: This is the absence of independence
of successive observations. This deplorable fact is a consequence
of the phenomenon that most of our data come in the form of
time series.¹ But economic data which are ordered in time can almost
never be regarded as random samples.²

Consider, for instance, a price series. If the fundamental assumption
of the independence of consecutive observations holds, we would have
to assume that the prices in two consecutive time intervals are independent.
This is obviously not the case. If it were so, we would be living in a kind
of “random economy.” In such an economy prices would, e.g., be
determined by arbitrary numbers taken at random from a hat. Two
consecutive prices of a commodity would be statistically independent.
A knowledge of the price in one time unit would help us in no way to
predict the price in the next time unit.

Such a situation is evidently very different from the actual facts of
economic life. Actually, consecutive prices are highly and positively
correlated with each other. If this were not the case, economic life
would be much more chaotic and uncertain than it is. It would be
utterly impossible to plan for the future. It is not intended to imply, of
course, that we can predict a future price with certainty if we know the


(n™ W y™M 9»," pp" r ' M ‘“ v Tim ’ "• I 




past price history and other relevant economic facts. Such a completely 

deterministic economy would be just as unrealistic as the “random 
economy” discussed above. 

Economic life is actually neither completely random nor completely
deterministic. It is statistical; i.e., given the past history and other
relevant facts, we may be able to predict the future only with a certain
probability but not with certainty. Economics is more like the phenomena
of statistical physics (e.g., statistical mechanics) than those of
classical physics.

The analysis of time series is a very important part of econometric 
methodology. 3 This is, however, a field which is much less well developed 
than the multivariate methods discussed in the second part of this book. 
Hence we will only present tentatively certain methods which seem to be 
applicable in this field. We cannot give a complete survey. The problems 
involved in time series analysis are frequently mathematically very difficult. 
There are many unsolved problems. Questions regarding estimation, 
tests of significance, tests of hypotheses are still open. 

Because of these features of the present state of the subject we will 
refrain from presenting mathematical derivations. These can be found 
in the literature which will be cited below. We will content ourselves
with presenting the statistical methodology and examples of applications
to economic phenomena and will discuss in some detail the meaning of
the assumptions underlying the various methods and of the results 
achieved. We will also try to evaluate the relative merits of the various 
methods presented. It should be emphasized that the methods are still 
far from perfect. 

One problem which arises in this connection is the following: There
are a number of stochastic models which we may use in analyzing economic
time series. To give an example: Suppose that we have a stationary time
series,⁴ i.e., a series without a trend. Then we may, for instance, try to
explain the oscillatory (irregularly periodic) movements in two ways:
(1) There is an underlying “true” period with random fluctuations
superimposed (section 9.2), or (2) the series follows a linear stochastic difference
equation or autoregressive scheme possibly with errors of observations
superimposed (section 10.3).
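The two explanations can be made concrete by generating one artificial series of each kind. The sketch below is purely illustrative and not taken from the text: the sine form of the "true" period, the second-order scheme, and all parameter values and function names are assumptions chosen for this example.

```python
import math
import random


def hidden_periodicity(n, period, amplitude, sigma, rng):
    """Model (1): a strictly periodic component plus independent disturbances."""
    return [amplitude * math.sin(2.0 * math.pi * t / period) + rng.gauss(0.0, sigma)
            for t in range(n)]


def autoregressive(n, a1, a2, sigma, rng, start=(1.0, 1.0)):
    """Model (2): x_t = a1*x_{t-1} + a2*x_{t-2} + e_t, a linear stochastic
    difference equation of the second order."""
    x = list(start)
    while len(x) < n:
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, sigma))
    return x[:n]


rng = random.Random(0)  # fixed seed so that the illustration is reproducible
periodic = hidden_periodicity(60, period=10, amplitude=1.0, sigma=0.3, rng=rng)
# a1 = 1.2, a2 = -0.6 gives complex characteristic roots of modulus below 1,
# i.e. damped oscillations that the random shocks keep alive.
ar2 = autoregressive(60, a1=1.2, a2=-0.6, sigma=0.3, rng=rng)
```

In model (1) the disturbances leave the period and amplitude of the underlying wave untouched, while in model (2) every shock is carried forward, so that period and amplitude vary irregularly. With short series the two may nevertheless look much alike.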

The choice between various models is very difficult. It may happen
that one model fits well and the others rather badly. But even in this
case if we have short economic series we may ask ourselves to what
extent this phenomenon is due to specification (i.e., the choice of the
stochastic model). It may also be due to the particularities of the empirical
sample which we analyze. Our data have always to be regarded
as samples from a hypothetical infinite population which embraces all
possible economic relationships.

3 J. H. Smith: Statistical Deflation in the Analysis of Economic Series
(dissertation, Chicago, 1941). H. T. Davis: The Analysis of Economic Time Series
(Bloomington, Ind., 1941).

4 W. Feller: op. cit., pp. 328 ff. H. Wold: A Study in the Analysis of
Stationary Time Series (Uppsala, 1938), pp. 32 ff.

There is as yet no valid procedure for the choice between various
theories or hypotheses. This problem has been discussed in section 1.2.
But it is to be hoped that eventually Carnap’s⁵ theory of inductive inference
will be able to deal with problems of this nature. As of today, the methods
given by Carnap are applicable only to the theory of attributes and hence
cannot deal with problems which involve continuous random variables.
They are not yet applicable in the field of economic time series analysis.

Since the terminology in this field is not too well established, we want
to define here the use of some terms. The meaning of the various terms
is not always the same in the literature. We will talk about periodic
movements if there is a fixed period in the sense that the movement strictly
repeats itself during every period. The seasonal movement (section 9.3)
is an example of such a movement. Oscillatory movements will be
observations which have certain irregularly periodic features, i.e., where
period and amplitude are not constant. An example of this type of
movement is the business cycle (section 9.2).

By autocorrelation we understand the lag correlation of a given series
with itself, lagged by a number of time units (section 10.1). The correlogram
(section 10.6) is a graph of these autocorrelation coefficients. By
serial correlation (lag correlation) we understand the lag correlation
between two different time series. Autocovariance and serial covariance
are similarly defined. A stationary time series is a series which has no
trend (section 10.6). The probabilities involved are invariant against
translations of the time axis.
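The definitions of autocorrelation and correlogram can be illustrated by a small computation. The sketch uses one common sample convention, in which deviations are taken from the overall mean and the sum of lagged products is divided by the total sum of squares; this convention and the function names are assumptions of the example and need not coincide with the formulas of Chapter 10.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself, lagged by `lag` units."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    lagged_products = sum(dev[t] * dev[t + lag] for t in range(n - lag))
    return lagged_products / sum(d * d for d in dev)


def correlogram(series, max_lag):
    """Autocorrelation coefficients for lags 0, 1, ..., max_lag."""
    return [autocorrelation(series, lag) for lag in range(max_lag + 1)]


# A strictly alternating series is strongly negatively autocorrelated at lag 1
# and strongly positively autocorrelated at lag 2.
alternating = [1.0, -1.0] * 5
coefficients = correlogram(alternating, 3)
```

Plotting `coefficients` against the lag gives the correlogram in the sense defined above.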

For the sake of simplicity our discussion will be mostly in terms of
the sample. But it is the goal of the analysis to estimate relations which
exist in the hypothetical infinite population from which our
empirical series are taken as samples. The estimation procedures are
still very unsatisfactory in this field. The same is true of tests of
significance and tests of hypotheses.

5 R. Carnap: “On inductive logic,” Philosophy of Science, vol. 12 (1945),
pp. 72 ff.; “The two concepts of probability,” Philosophy and Phenomenological
Research, vol. 5 (1945), pp. 513 ff.; Logical Foundations of Probability (Chicago,
1950). G. Tintner: “Foundations of probability and statistical inference,”
Journal of the Royal Statistical Society, vol. 112 (1949), pp. 252 ff.

In the following chapters we will sometimes present non-parametric
procedures.⁶ These methods are based upon the assumption that the
form of the distribution functions in question is not specified. Rank 
correlation is a well-known example of such procedures (section 8.4). 
Non-parametric methods allow us to make exact tests, but under very 
restrictive assumptions (sections 8.4, 9.4, and 10.1.1). 

The next two chapters (Chapter 8 and Chapter 9) will deal with the 
analysis of the trend and with the investigation of oscillatory and periodic 
movements. In these chapters we will present some of the more traditional
methods of time series analysis. These methods are based upon
the idea of the independence of the various components of a time series, 
trend, cycle, etc. These assumptions are hardly tenable, and the methods 
cannot give us more than very rough approximations. 

Chapter 10 deals with the interdependence of observations. Here we 
reach the most important problems of economic time series analysis. 
The mathematical problems involved are very difficult, and hence the 
results achieved in this field are frequently not yet satisfactory. But a 
certain amount of progress has been made. One important shortcoming 
is the almost complete lack of small sample theory. It would be most 
helpful to have some reliable small sample tests in this field, since most 
empirical economic time series are rather short. 

The last chapter, Chapter 11, deals with transformations of observations. 
By transforming observations we seek to make them amenable to the 
more traditional methods of statistical inference which are based upon 
the independence of consecutive observations. Some of these procedures 
have been described in Part 2 of this book. This type of method should 
of course be applied only if a better procedure is not available. We have, 
in general, to base our transformations upon information contained in 
our sample, which is frequently rather small with economic time series. 
Hence the transformations involved may be greatly influenced by the 
peculiar features of the sample, i.e., the empirical time series analyzed. 
This will make the whole procedure of transforming the observations 
rather doubtful. 

Much of what is contained in the following pages has been inspired
by the excellent work of H. T. Davis, The Analysis of Economic Time
Series.⁷ The reader is referred to this book for a fuller treatment of
many topics.

6 H. Scheffé: “Statistical inference in the non-parametric case,” Annals of
Mathematical Statistics, vol. 14 (1943), pp. 305 ff.

7 H. T. Davis: op. cit.



Chapter 8 


The Trend 


It has long been observed that many economic time series show a
tendency to grow. There is a secular trend apparent in most empirical
series. It is also possible that there may be periodic movements of long
duration, the so-called Kondratieff waves.¹ We will here deal only with
movements which are not oscillatory or periodic. By an oscillatory
movement we mean a movement which is irregularly periodic with
changing period and amplitude. Time series without trend are called
stationary.

This excludes the seasonal variation, the business cycle, and also the 
long waves. All these are oscillatory or possibly periodic. They have 
either a more or a less fixed period, like the seasonal variation whose 
period is 12 months. Or alternatively they show oscillations, i.e., periodic 
movements of variable period and amplitude like the business cycle. 

In this chapter we deal only with the following methods for trend
elimination: (a) orthogonal polynomials (section 8.1); (b) moving averages
(section 8.2); (c) Hotelling’s method of fitting the logistic (section
8.3). Further we will give (d) a non-parametric test for the trend
(section 8.4).

The variety of methods available for trend elimination is somewhat
embarrassing. The method of fitting orthogonal polynomials is quite
popular in economic applications. Without doubt, this method will give
valid results in many cases. But there is no reason to assume that “true”
trends in economic time series are really polynomials. Hence the method
should be used with some caution and certainly should not be applied
for extrapolation. It is even doubtful whether the trend of economic
series of some length can be validly approximated over the whole range
by a polynomial of a low order.

The method of moving averages has some advantages. It is very
simple and can be easily applied. It does not make the assumption that
the trend can be represented over the whole extent of the series by a given
function. We have only to assume that in a limited neighborhood (say,
over 5 years) the trend behaves approximately like a polynomial, e.g.,
like a straight line. This is a much less stringent assumption, and hence
the method is much more flexible. This may be a desirable property for
the statistical methods which are to be used in this field.

1 N. D. Kondratieff: “The long waves in economic life,” Readings in Business
Cycle Theory (Philadelphia, 1944), pp. 43 ff.
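The local character of the moving average can be seen in a small sketch. The simple unweighted, centered average used here is an assumption made for illustration (section 8.2 may use other weights), and the function name is hypothetical.

```python
def moving_average(series, window):
    """Centered, unweighted moving average over an odd number of terms.

    Values are returned only where the full window is available, so the
    result is shorter than the input by window - 1 terms.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return [sum(series[t - half:t + half + 1]) / window
            for t in range(half, len(series) - half)]


# A straight-line trend is reproduced exactly by the moving average.
line = [2 * t + 1 for t in range(10)]
smoothed = moving_average(line, 5)
```

The example reflects the assumption stated above: within each 5-term neighborhood the trend is treated as a straight line, and nothing is assumed about its form over the whole series. The price is the loss of `window - 1` terms at the ends.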

Some economic time series are more or less closely connected with the
development of population. In this case we might be justified in fitting
a logistic trend. It has been observed that animal and also human
populations follow this law. This particular form of the trend is difficult
to fit because of the fact that the parameters enter in a non-linear fashion.
We describe below Hotelling’s very ingenious methods in achieving
such a fit.

All these methods assume that there is actually independence between
the various components of an economic time series, e.g., between the
trend and the cycle. This is clearly not the case. Hence these procedures
can be regarded as only very rough approximations. The author has
indicated in his business cycle theory² that it is possible to “explain,” for
instance, both the trend and the cycle or cycles by a system of linear
differential equations (see section 3.8). A similar point of view underlies
also Schumpeter’s theory of business cycles.³

We would like to have methods which can deal at the same time with 
the evolutionary trend and the cycle or cycles. Stochastic difference 
equations or differential equations would offer a promising beginning. 
These ideas are discussed in sections 10.3 and 10.4. But time series 
analysis is still mostly restricted to stationary time series, i.e., series without 
secular trend. Hence we have first to remove the trend before starting 
to apply the more modern methods of time series analysis. For this 
reason it is still important to deal with trend elimination. 

An important application of trends for the transformation of observations
will be discussed in section 11.1.

2 G. Tintner: “A ‘simple’ theory of business fluctuations,” Econometrica,
vol. 10 (1942), pp. 317 ff.; “The ‘simple’ theory of business fluctuations: A
tentative verification,” Review of Economic Statistics, vol. 26 (1944), pp. 148 ff.

3 J. A. Schumpeter: The Theory of Economic Development (Cambridge,
Mass., 1934); Business Cycles, vols. 1, 2 (New York, 1939).

8.1 Orthogonal Polynomials

The fitting of polynomial trends by the method of least squares (section
5.1) is very much facilitated by the use of orthogonal polynomials. Let
the variable $t$ designate time. Instead of fitting by the method of least
squares an ordinary polynomial:

(1) $Z_t = a_0 + \sum_{i=1}^{p} a_i t^i \qquad (t = 1, 2, \ldots, N)$

where the $a_i$ are constants, it is more convenient to fit a polynomial of
the form:

(2) $Y_t = A_0' + \sum_{i=1}^{p} A_i' \xi'_{it} \qquad (t = 1, 2, \ldots, N)$

where the $A_i'$ are constants. This method is preferable if the $t$ are equally
spaced (e.g., years, months) and if tables of the polynomials are available.
These polynomials are orthogonal,¹ i.e.,

(3) $\sum_{t=1}^{N} \xi'_{it} \xi'_{jt} = 0 \qquad (i \neq j)$

$N$ is the number of observations. It is easy to compute polynomials of
higher order, e.g., of order $p + 1$, by just computing $A'_{p+1}$ without having
to recompute all other values of the coefficients $A_1', A_2', \ldots, A_p'$.
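The orthogonality condition can be illustrated by constructing such polynomial values for a small N. The sketch below obtains them by Gram-Schmidt orthogonalization of the powers of t in exact rational arithmetic; this construction and the function name are assumptions made for illustration, and the resulting values may differ from published tables by a constant factor for each degree.

```python
from fractions import Fraction


def orthogonal_polynomials(n_obs, degree):
    """Return xi[i][t-1], the value at t = 1..n_obs of a polynomial of degree i,
    for i = 0..degree, with the columns mutually orthogonal over t."""
    ts = [Fraction(t) for t in range(1, n_obs + 1)]
    xi = []
    for i in range(degree + 1):
        col = [t ** i for t in ts]
        # Remove from t^i its components along all lower-degree polynomials.
        for prev in xi:
            coef = (sum(c * p for c, p in zip(col, prev))
                    / sum(p * p for p in prev))
            col = [c - coef * p for c, p in zip(col, prev)]
        xi.append(col)
    return xi


xi = orthogonal_polynomials(n_obs=7, degree=3)
cross_sums = [sum(a * b for a, b in zip(xi[i], xi[j]))
              for i in range(4) for j in range(i + 1, 4)]
```

For N = 7 the first-degree values are t - 4 and the second-degree values are (t - 4)^2 - 4, and every entry of `cross_sums` is exactly zero. This orthogonality is what allows a term of higher order to be added without recomputing the earlier coefficients.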

Let us now assume that our observed series is of the following form:

(4) $X_t = Y_t + \varepsilon_t \qquad (t = 1, 2, \ldots, N)$

The quantities $\varepsilon_t$ are random variables. They are normally distributed.
Their mean is zero and their population variance is a constant $\sigma^2$. They
are also not autocorrelated:

(5) $E\varepsilon_t = 0 \qquad (t = 1, 2, \ldots, N)$

(6) $E\varepsilon_t^2 = \sigma^2 \qquad (t = 1, 2, \ldots, N)$

(7) $E\varepsilon_t \varepsilon_{t+s} = 0 \qquad (s, t = 1, 2, \ldots, N)$


These assumptions enable us to obtain a theory which is particularly
simple. But conditions (5) to (7) and the assumption of normality may not
always hold for economic time series. In that case the distribution
theory presented below will not hold.

Under the assumptions stated there are tests which enable us to determine
whether actually a polynomial of order $p + 1$ gives a more efficient
representation of the trend than a polynomial of order $p$. We would
actually like to determine the degree of the polynomial which represents
most adequately the trend in question.

This is a multiple choice problem, and ought to be treated in a suitable
fashion.² No valid theory exists, however (see section 1.2). Still, the
analysis of variance test gives us an idea about the answer to the following
question: Is the residual variance after fitting a polynomial of degree
$p + 1$ significantly smaller than the residual variance after fitting a
polynomial of degree $p$? In this way we may form an idea about the
degree of the polynomial to be fitted to a given time series, by fitting
successively polynomials of degrees 1, 2, 3, ... and making the analysis
of variance test after each step.

1 H. T. Davis: The Analysis of Economic Time Series (Bloomington, Ind.,
1941). See also M. Sasuly: Trend Analysis of Statistics (Washington, 1934).

The actual process of fitting is as follows:³ Assume that we have a
time series of $N$ equally spaced terms: $X_1, X_2, \ldots, X_N$. The coefficient
$A_0'$ is given by:

(8) $A_0' = \dfrac{1}{N} \sum_{t=1}^{N} X_t$

This is simply the arithmetic mean of the series of observations $X_t$.
The higher coefficients are given by:

(9) $A_j' = \dfrac{\sum_{t=1}^{N} X_t \xi'_{jt}}{\sum_{t=1}^{N} \xi'^{\,2}_{jt}} \qquad (j = 1, 2, \ldots)$

The values of $\xi'_{jt}$ for given $N$ and also the sums $\sum_{t=1}^{N} \xi'^{\,2}_{jt}$ are tabulated.⁴

The residual sum of squares after fitting a polynomial of degree $p$ is:

(10) $S_p = \sum_{t=1}^{N} X_t^2 - \sum_{j=0}^{p} A_j' \sum_{t=1}^{N} X_t \xi'_{jt} = S_{p-1} - A_p' \sum_{t=1}^{N} X_t \xi'_{pt} \qquad (p = 1, 2, \ldots)$

This is distributed, apart from the scale factor $\sigma^2$, like $\chi^2$ with
$N - p - 1$ degrees of freedom, under the
assumption that the deviations from the fitted trend follow a normal
distribution and are independent.

The difference of the sums of squares, $S_{p-1} - S_p$, is distributed
independently from $S_p$ like $\chi^2$ with 1 degree of freedom.


2 G. Tintner: “Foundations of probability theory and statistical inference,”
Journal of the Royal Statistical Society, vol. 112 (1949), pp. 252 ff.

3 R. A. Fisher: Statistical Methods for Research Workers (10th ed., Edinburgh
and London, 1946), pp. 147 ff., section 27. R. A. Fisher and F. Yates:
Statistical Tables for Biological, Agricultural and Medical Research (Edinburgh, 1938).
R. L. Anderson and E. E. Houseman: “Tables of orthogonal polynomial values
extended to N = 104,” Agricultural Experiment Station, Iowa State College,
Research Bulletin 297 (Ames, Iowa, 1942).

4 R. L. Anderson and E. E. Houseman: op. cit.



Hence:

(11) $F_p = \dfrac{(N - p - 1)(S_{p-1} - S_p)}{S_p} \qquad (p = 1, 2, \ldots)$

follows the F-distribution with 1 and $N - p - 1$ degrees of freedom.

The residual variance is:

(12) $\dfrac{S_p}{N - p - 1}$

This variance provides an estimate of the population variance of the
errors or deviations, $\varepsilon_t$, namely of $\sigma^2$ defined in (6).
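The successive fitting and testing described in formulas (8) to (12) can be sketched as follows. The polynomial values are again generated by Gram-Schmidt orthogonalization rather than taken from tables, so the coefficients refer to differently scaled polynomials than the tabulated ones; the residual sums of squares and the variance ratios are not affected by this scaling. The function names and the artificial series are assumptions of the example.

```python
def orthogonal_values(n_obs, degree):
    """Orthogonal polynomial values on t = 1..n_obs for degrees 0..degree."""
    ts = [float(t) for t in range(1, n_obs + 1)]
    xi = []
    for i in range(degree + 1):
        col = [t ** i for t in ts]
        for prev in xi:
            coef = (sum(c * p for c, p in zip(col, prev))
                    / sum(p * p for p in prev))
            col = [c - coef * p for c, p in zip(col, prev)]
        xi.append(col)
    return xi


def fit_orthogonal_trend(x, max_degree):
    """Return coefficients, residual sums of squares S_0..S_max and the
    variance ratios F_1..F_max for successively higher degrees."""
    n = len(x)
    coefs, resid_ss, f_stats = [], [], []
    s = sum(v * v for v in x)                 # sum of squares before any fit
    for p, col in enumerate(orthogonal_values(n, max_degree)):
        sxy = sum(v * c for v, c in zip(x, col))
        a = sxy / sum(c * c for c in col)     # coefficient of degree p
        s_prev, s = s, s - a * sxy            # S_p = S_{p-1} - A_p * sum(X * xi_p)
        coefs.append(a)
        resid_ss.append(s)
        if p >= 1:
            f_stats.append((n - p - 1) * (s_prev - s) / s)
    return coefs, resid_ss, f_stats


# Artificial series: a parabola plus a small alternating disturbance.
series = [(t - 5) ** 2 + (0.1 if t % 2 else -0.1) for t in range(1, 10)]
coefs, resid_ss, f_stats = fit_orthogonal_trend(series, 3)
residual_variance = resid_ss[2] / (len(series) - 2 - 1)
```

For this series the variance ratio is very large at the second degree and practically zero at the first and third, which suggests a parabola. The caution stated in the text applies: such a sequence of tests is not a valid solution of the multiple choice problem.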

This method will in some cases give useful trends in economic time 
series. We should, however, not forget the limitations of the procedures: 

(1) The deviations from the polynomial trend must be normally and 
independently distributed. This may not always be the case. We will 
give in sections 10.1 and 10.2 methods for actually testing the validity 
of these assumptions. We present in section 10.5 a modification of the 
method of least squares which gives valid estimates and does not assume 
normality and independence of the deviations. 

(2) Polynomial trends should never be used for extrapolation. They 
should be considered only crude approximations for the probably vastly 
more complicated “real” trends of our series. These approximations
hold only within the range of the data. The method of moving averages 
(section 8.2) and the variate difference method (section 11.2) make less 
stringent assumptions about the nature of the trends. 

(3) The analysis of variance tests are not really valid for the purpose
of deciding the order of the polynomial which should be fitted. This is
a multiple choice problem. Hence great caution is necessary, and we
should not assume that the polynomial fitted on the basis of these tests
actually represents adequately the “true” trend in all cases. The residual
variance may also not be a good estimate of the variance of the random
element.

Polynomial trends may be used to transform the observations so that
the transformed variables become non-autocorrelated and classical
statistical methods can be used. This subject will be treated in section
11.1.2.


Example 1. R. L. Anderson and E. E. Houseman have used the method
of orthogonal polynomials to fit a polynomial trend to the series of 62
annual sugar prices. 5 They decided to fit a fourth-degree polynomial
using orthogonal polynomials.

The sum of the products of the series multiplied with the values of the
orthogonal polynomials is given in Table 1, together with the sum of
squares of the polynomials, taken from the tables. This table gives also
the value of the coefficient $A_0'$ computed with the help of formula (8)
and the values of the coefficients $A_i'$ (i = 1, 2, 3, 4) computed according
to formula (9).

5 R. L. Anderson and E. E. Houseman: op. cit., pp. 600 ff.




TABLE 1

Degree of        Sum of products      Sum of squares
Polynomial i     ΣX_t ξ'_it           Σξ'²_it              A_i'

    0                    1,336                    62        21.548
    1                 - 20,286                79,422       - 0.2554
    2                   72,775             1,270,752         0.05727
    3              - 1,080,557           139,238,112       - 0.0077605
    4              - 7,599,201       103,639,568,032       - 0.00007332


We proceed now to the tests of significance. We give in Table 2 the
sums of squares $S_p$ computed from formula (10).


TABLE 2

Sums of Squares of Deviations from Polynomial Regression

Degree of Polynomial    Sum of Squares of Deviations    Degrees of Freedom
        p                          S_p                      N - p - 1

        0                        25,250                        61
        1                        20,069                        60
        2                        15,901                        59
        3                         7,515                        58
        4                         6,958                        57


For the sum of squares of the deviations from the linear trend we have
from the same Table 1: $S_1 = S_0 - A_1' \sum X_t \xi'_{1t}$ = 25,250 - (- 0.2554)
(- 20,286) = 20,069, etc.

Finally, we compute from formula (11) the variance ratios and test 
them with the help of Table 3 of Snedecor’s F. 

TABLE 3

Variance Ratios

Degree of Polynomial    Variance Ratio
        p                    F_p

        1                   15.5†
        2                   15.5†
        3                   65.7†
        4                    4.6*

* Significant at the 5 per cent level of significance.
† Significant at the 5 and 1 per cent levels of significance.
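The chain of computations from Table 1 to Table 3 can be retraced with a short script. The following is only a sketch: the function names are hypothetical, the inputs are the rounded sums printed in Tables 1 and 2, and the results therefore agree with the tables only up to rounding.

```python
# Sums taken from Table 1 (degrees 1..4):
# sum of X_t * xi'_it and sum of xi'_it squared.
sum_products = [-20286.0, 72775.0, -1080557.0, -7599201.0]
sum_squares = [79422.0, 1270752.0, 139238112.0, 103639568032.0]
N = 62          # number of annual observations
S0 = 25250.0    # sum of squares about the mean (Table 2, p = 0)


def residual_sums(s0, products, squares):
    """Return [S_0, S_1, ...] using S_p = S_{p-1} - A_p' * sum(X_t * xi'_pt)."""
    sums = [s0]
    for prod, sq in zip(products, squares):
        coeff = prod / sq                      # formula (9)
        sums.append(sums[-1] - coeff * prod)   # formula (10)
    return sums


def variance_ratios(sums, n):
    """Return F_p = (N - p - 1)(S_{p-1} - S_p) / S_p for p = 1, 2, ..."""
    return [(n - p - 1) * (sums[p - 1] - sums[p]) / sums[p]
            for p in range(1, len(sums))]


S = residual_sums(S0, sum_products, sum_squares)
F = variance_ratios(S, N)
for p, (s, f) in enumerate(zip(S[1:], F), start=1):
    print(f"p={p}  S_p={s:10.1f}  F_p={f:6.2f}")
```

Each ratio is then compared with the tabulated F value for 1 and N - p - 1 degrees of freedom, exactly as in the text.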



We have, for instance, for the test of the sum of squares of the deviations
from the fit of a polynomial of second degree ($S_2$) compared with the
residual sum from the fit of a polynomial of the third degree ($S_3$):
$F_3 = (58)(15,901 - 7,515)/7,515$, which is the entry for p = 3 in Table 3.
This is significant on the 1 per cent level for 1 and 58 degrees of freedom.

If we want to test whether the addition of the fourth-degree term
diminishes significantly the residual variance after fitting a third-degree
polynomial, we compute: $F_4 = (57)(7,515 - 6,958)/6,958 = 4.6$. This
value is significant at the 5 per cent level for 1 and 57 degrees of freedom.

The trend may then be represented by the fourth-degree polynomial:

(13) $X'_t = 21.548 - 0.2554\xi'_{1t} + 0.05727\xi'_{2t} - 0.0077605\xi'_{3t} - 0.00007332\xi'_{4t}$

If this fit is valid, we may consider the quantity $V_4 = 6,958/57 = 122.07$
from formula (12) as an estimate of the population variance.

TABLE 4

Year    X_t     ξ'_1t   ξ'_2t   ξ'_3t   ξ'_4t   ξ'_5t    Trend

1919   171.5    - 11      77    - 77    1463   - 209   165.8047
1920   167.0    - 10      56    - 35     133      76   169.4558
1921   164.5     - 9      37     - 3   - 627     171   171.9261
1922   169.3     - 8      20      20   - 950     152   173.3501
1923   179.4     - 7       5      35   - 955      77   173.8586
1924   179.2     - 6     - 8      43   - 747    - 12   173.5850
1925   172.6     - 5    - 19      45   - 417    - 87   172.6622
1926   170.5     - 4    - 28      42    - 42   - 132   171.2227
1927   168.6     - 3    - 35      35     315   - 141   169.3994
1928   164.7     - 2    - 40      25     605   - 116   167.3251
1929   163.0     - 1    - 43      13     793    - 65   165.1324
1930   162.1       0    - 44       0     858       0   162.9540
1931   160.2       1    - 43    - 13     793      65   160.9229
1932   161.2       2    - 40    - 25     605     116   159.1716
1933   165.8       3    - 35    - 35     315     141   157.8330
1934   163.5       4    - 28    - 42    - 42     132   157.0397
1935   146.7       5    - 19    - 45   - 417      87   156.9246
1936   160.2       6     - 8    - 43   - 747      12   157.6204
1937   156.8       7       5    - 35   - 955    - 77   159.2597
1938   156.8       8      20    - 20   - 950   - 152   161.9755
1939   165.4       9      37       3   - 627   - 171   165.9003
1940   174.7      10      56      35     133    - 76   171.1670
1941   178.7      11      77      77    1463     209   177.9083


The sums of squares of the polynomials are given in Table 5. 


TABLE 5

Sums of Squares of Orthogonal Polynomials (N = 23)

Degree of Polynomial    Sum of Squares
        p

        0                        23
        1                     1,012
        2                    35,420
        3                    32,890
        4                13,123,110
        5                   340,860


The sums of the products of our variable times the values of the
orthogonal polynomials are given in Table 6.

TABLE 6

Degree of Polynomial    Sum of Products
        p

        1                   - 383.6
        2                   2,606.0
        3                   4,366.0
        4                  17,252.4
        5                      17.8


In Table 7 we present the constants $A_p'$ and the residual sums of squares
$S_p$ together with the appropriate degrees of freedom.

TABLE 7

Degree of                        Residual Sum      Degrees of
Polynomial     Constant          of Squares        Freedom
    p            A_p'               S_p            N - p - 1

    0          166.1913          1369.538260          22
    1         - 0.379051         1224.134296          21
    2           0.073574         1032.400452          20
    3           0.132745          452.835782          19
    4           0.001314658       430.154777          18
    5           0.000052220       430.153847          17


The values in the second column of Table 7 have been computed by
formula (9). We have, for instance, for p = 2: $A_2'$ = 2606.0/35,420 =
0.073574, etc.

Similarly, the values of the third column have been computed from
formula (10). We have, for instance, for p = 3: $S_3 = S_2 - A_3' \sum X_t \xi'_{3t}$ =
1032.400452 - (0.132745)(4366) = 452.835782. This is distributed like
$\chi^2$ with 19 degrees of freedom, as indicated in column four.

Finally we give in Table 8 the variance ratios (F) together with the
permissible variance ratios at the 5 and 1 per cent levels of significance. The
variance ratios are computed from formula (11). We have, for instance,
for p = 3: $F_3 = (S_2 - S_3)(19)/S_3$ = (1032.400452 - 452.835782)(19)/
452.835782 = 24.317.


TABLE 8

                                  Permissible Variance Ratio
Degree of                           level of significance
Polynomial    Variance Ratio      5 per cent     1 per cent

    1             2.495              4.32           8.02
    2             3.714              4.35           8.10
    3            24.317              4.38           8.18
    4             0.949              4.41           8.28
    5             0.000036           4.45           8.40
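The whole of this example can be retraced from the 23 observations of Table 4. The sketch below is not the authors' computing scheme: instead of reading the orthogonal polynomial values from the tables it uses the closed forms t, t² - 44 and (t³ - 79t)/6, which are an assumption chosen because they reproduce the tabulated columns for N = 23. Only degrees 1 to 3 are fitted.

```python
# Series from Table 4 (23 annual values, 1919-1941).
X = [171.5, 167.0, 164.5, 169.3, 179.4, 179.2, 172.6, 170.5, 168.6, 164.7,
     163.0, 162.1, 160.2, 161.2, 165.8, 163.5, 146.7, 160.2, 156.8, 156.8,
     165.4, 174.7, 178.7]
N = len(X)
t = [i - (N - 1) // 2 for i in range(N)]      # -11, ..., 11

# Closed forms matching the tabulated values for N = 23 (assumption).
xi = {
    1: [float(u) for u in t],
    2: [float(u * u - 44) for u in t],
    3: [(u ** 3 - 79 * u) / 6.0 for u in t],
}

mean = sum(X) / N
A = [mean]                                     # A_0', formula (8)
S = [sum((x - mean) ** 2 for x in X)]          # S_0
F = []
for p in (1, 2, 3):
    prod = sum(x * z for x, z in zip(X, xi[p]))          # as in Table 6
    sq = sum(z * z for z in xi[p])                       # as in Table 5
    A.append(prod / sq)                                  # formula (9)
    S.append(S[-1] - A[-1] * prod)                       # formula (10)
    F.append((N - p - 1) * (S[p - 1] - S[p]) / S[p])     # formula (11)

# Cubic trend values and residual variance, formula (12).
trend = [A[0] + sum(A[p] * xi[p][i] for p in (1, 2, 3)) for i in range(N)]
V3 = S[3] / (N - 3 - 1)

print([round(a, 6) for a in A])
print([round(f, 3) for f in F])
print(round(trend[0], 4), round(V3, 4))
```

The printed coefficients, variance ratios and first trend value can be compared with Tables 7, 8 and 4 respectively.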


The only variance ratio which is significant is $F_3$, i.e., the
transition from the parabolic to the cubic trend. We adopt the cubic
trend for our time series, since $F_4$ and $F_5$ are not significant.
The equation adopted is:

(14) $X'_t = 166.1913 - 0.379051\xi'_{1t} + 0.073574\xi'_{2t} + 0.132745\xi'_{3t}$

The values of the trend are also given in the last column of Table 4.
The residual variance according to formula (12) is $V_3$ = 452.835782/19
= 23.833462.

It appears that the trend of the time series representing the consumption
of meat can be represented with reasonable accuracy by a cubic for the 
period analyzed. This must, however, be considered only a rough 
approximation to the true trend of the series, which is probably a more 
complicated function. There is also some reason to doubt the validity 
of the estimate for the variance of the random element. 


8.2 Moving Averages 

8.2.1 THE USE OF MOVING AVERAGES 

A method which has the advantage of great simplicity in its application 
is the method of moving averages. 1 In economics, we frequently have 
no reason to assume a specific functional form of the trend (polynomial, 
exponential, etc.). But if we have a secular trend and on it superimposed
a cyclical fluctuation with a given constant period (e.g., the seasonal with 
a period of 12 months) we may remove the periodic component by applying 
a moving average with constant length. The length of the moving 
average is the period of the cycle. The method of moving averages 
assumes also that there is a constant amplitude in the cyclical fluctuation. 
But no great error will be committed if the amplitude changes slowly 
(section 9.3). 

The length of the moving average is the period of the cyclical component 
(e.g., 12 months for the seasonal). The series which has been smoothed 
by this application of the moving average will frequently be free or almost 
free of this particular cyclical variation. These procedures will be dis¬ 
cussed in connection with the elimination of the seasonal variation 
(section 9.3). 

The situation is somewhat more complicated if we have a trend and a 
superimposed cyclical movement of varying period. This is indeed the 
case with the business cycle, which has no fixed period or amplitude. 
But in this case we may estimate the period of the cyclical movement by 
measuring the distances between maxima and between minima. This will 
give us the approximate length of the moving average for the midpoints 
between the maxima and the minima. For intermediate points we will 
use moving averages whose length has been determined by linear inter¬ 
polation. This procedure should give us with some luck reasonably good 
approximations to the “true” trend as it presumably exists in the infinite
hypothetical population from which our time series is a sample. 

1 F. R. Macaulay: Smoothing of Time Series (New York, 1931). E. C.
Rhodes: Smoothing, Tracts for Computers 6 (Cambridge, 1931). A. Hald:
The Decomposition of a Series of Observations (Copenhagen, 1948). G. Tintner:
Prices in the Trade Cycle (Vienna, 1935), pp. 22 ff.





For the sake of simplicity, we will frequently use moving averages
with constant weights. Let our original time series be $X_1, X_2, \ldots, X_N$.
If we use a moving average of constant weight and
of length 2m + 1, then the smoothed series will be:

(1) $X'_t = \frac{\sum_{i=-m}^{m} X_{t+i}}{2m + 1}$ $(t = m + 1, m + 2, \ldots, N - m)$

This is equivalent to the fitting of a straight line to 2m + 1 consecutive
observations by the method of least squares.

If we have a strictly periodic movement of period 2m + 1 we have:

(2) $y_t = y_{t+k(2m+1)}$, $\sum_{i=-m}^{m} y_{t+i} = 0$ $(k = 0, 1, \ldots)$

Hence the smoothed series $y'_t$ is:

(3) $y'_t = \frac{\sum_{i=-m}^{m} y_{t+i}}{2m + 1} = 0$

since the sum is zero.

This assumption of strict periodicity will, however, only rarely hold
true with empirical time series.

If the period of the cyclical movement is even, we may get a correctly
centered moving average by including only one-half of the first and of
the last term.

If the trend cannot be taken as linear over the length of the moving
average, we may try to approximate it by a polynomial of degree n. The
author's book on the variate difference method gives tables which provide
the weights. 2 This problem is discussed in section 11.2.4.

Denote the weights by $w_{-m}, w_{-m+1}, \ldots, w_{-1}, w_0, w_1, \ldots, w_m$.
Then we have in the notation of section 11.2.4:

(4) $w_j = w_{-j} = g_{nj}$ $(j = 0, 1, 2, \ldots, m)$

The smoothed series will be:

(5) $X'_t = g_{n0} X_t + \sum_{j=1}^{m} g_{nj}(X_{t+j} + X_{t-j})$ $(t = m + 1, m + 2, \ldots, N - m)$

2 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940),
pp. 100 ff. M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London,
1946), pp. 372 ff.





This method is equivalent to fitting a polynomial of degree n to all 
2m + 1 consecutive observations by the method of least squares. It 
should be noted that we do not make the assumption here that the trend 
can be represented over the whole range of the data by a polynomial of 
the nth degree. We make the much weaker assumption that over a
limited range of the data, i.e., over all consecutive 2m + 1 items, we can
approximate the true trend by a polynomial of the nth degree. Since
the more rigid assumption is in all probability not justified with economic 
data, there is reason to prefer the method of moving averages to the 
method of fitting polynomial trends, described in the previous section, 8.1. 
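The equivalence between a weighted moving average and a local least squares polynomial can be checked numerically. The sketch below is an illustration, not the procedure of the tables cited in the text: it derives the weights that give the fitted central value when a polynomial of a chosen degree is fitted to 2m + 1 equally spaced points. The function names are hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a small linear system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def central_weights(m, degree):
    """Weights w_{-m}..w_m of the fitted value at the centre of 2m+1 points."""
    offsets = range(-m, m + 1)
    # Normal equations (X'X) c = e_0: the fitted central value is the
    # constant term of the polynomial, a linear function of the data.
    xtx = [[sum(Fraction(i) ** (r + s) for i in offsets)
            for s in range(degree + 1)] for r in range(degree + 1)]
    e0 = [Fraction(1)] + [Fraction(0)] * degree
    c = solve(xtx, e0)
    return [sum(c[r] * Fraction(i) ** r for r in range(degree + 1))
            for i in offsets]


print([str(w) for w in central_weights(1, 1)])   # straight line, 3 points
print([str(w) for w in central_weights(2, 2)])   # parabola, 5 points
```

For a straight line the weights are constant, which is the case of the simple moving average. For a parabola through five points they are -3/35, 12/35, 17/35, 12/35, -3/35.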

There are, however, some difficulties connected with the application of 
this method to empirical series. (1) In fitting a moving average of length 
2m + 1 we lose 2m items, m at the beginning of the series and m at the 
end. Since economic time series are frequently short, this loss may be
important. (2) If we use the method of moving averages to eliminate
the season, we have to neglect the influence of changing seasonal ampli¬ 
tudes. The amplitudes may especially be correlated with the business 
cycle. This impairs the validity of the results. (3) If we use the method 
for the elimination of the cycle, we have to determine the maxima and 
minima. Because of random fluctuations which are always present it 
may be very difficult to locate exactly the true maxima and minima, and 
hence we may choose the wrong length of the moving average. But if 
this is the case there may be errors in the resulting smoothed series. 

Example 1. We apply the method of moving averages to the effort 
to find the short cyclical movement (40-month cycle) 3 in the American 
wholesale price index of all commodities, 1890-1947. The base is 
1926 = 100. We will try to eliminate not only the trend but also the 
longer cycles. The moving average has been constructed by inspection
of the graph, but reference has also been made to the business annals of
Thorp and Mitchell 4 and Burns and Mitchell. 5

We present in Table 1 the original series, the length of the moving
average for the particular year, and the smoothed series. The moving
averages which are actually established from the graph or from the
business annals are denoted by asterisks. They are measured from
maximum to maximum and from minimum to minimum. All other
values are interpolated. It is sufficient to use linear interpolation for
this purpose.

3 J. A. Schumpeter: Business Cycles, vol. 1 (New York, 1939).

4 W. L. Thorp and W. C. Mitchell: Business Annals (New York, 1926).

5 A. F. Burns and W. C. Mitchell: Measuring Business Cycles (New York,
1946), pp. 76 ff.



TABLE 1

Wholesale Price Index of All Commodities

Year     Index    Length of Moving Average    Smoothed Series

1890      56.2
1891      55.8              3*                     54.7
1892      52.2              2                      53.4
1893      53.4              2*                     51.7
1894      47.9              2*                     49.5
1895      48.8              2*                     48.0
1896      46.5              3                      47.3
1897      46.6              4                      48.0
1898      48.5              5*                     50.0
1899      52.5              4                      51.9
1900      56.1              3                      54.5
1901      55.3              2*                     56.4
1902      58.9              3                      57.9
1903      59.6              4*                     59.0
1904      59.7              5*                     60.0
1905      60.1              4                      61.0
1906      61.8              3                      62.4
1907      65.2              3                      63.3
1908      62.9              3*                     65.2
1909      67.6              3*                     67.0
1910      70.4              3*                     67.6
1911      64.9              3*                     68.1
1912      69.1              3                      67.9
1913      69.8              3*                     69.0
1914      68.1              4                      71.2
1915      69.5              5                      82.1
1916      85.5              5*                     94.3
1917     117.5              4                     109.6
1918     131.3              3                     129.1
1919     138.6              2*                    140.7
1920     154.4              3                     130.2
1921      97.6              3*                    116.2
1922      96.7              3*                     98.3
1923     100.6              2                      99.0
1924      98.1              2*                    100.1
1925     103.5              2*                    101.3
1926     100.0              3*                     99.6
1927      95.4              3*                     97.4
1928      96.7              4                      95.2
1929      95.3              5*                     89.4
1930      86.4              6                      82.8
1931      73.0              7                      80.3
1932      64.8              8                      78.6
1933      65.9              9*                     78.6
1934      74.9              8                      76.0
1935      80.0              7*                     75.9
1936      80.8              7                      79.0
1937      86.3              6                      79.3
1938      78.6              6                      80.5
1939      77.1              6                      83.0
1940      78.6              6*                     85.9
1941      87.3              6                      89.4
1942      98.8              6*                     93.9
1943     103.1
1944     104.0
1945     105.8
1946     121.1
1947     151.8


When moving averages with even length occur, we include one-half of 
the extreme values, in order to achieve correct centering. 

For the smoothed value for 1891 we have, for instance, to use a moving
average of length 3. Hence we have (56.2 + 55.8 + 52.2)/3 = 54.7. But
for the smoothed value in the year 1892 we have to use a moving average
with two terms. We take [(1/2)(55.8) + 52.2 + (1/2)(53.4)]/2 = 53.4, etc.
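The two cases just described, odd length and even length with half-weights on the extreme values, can be written as one small routine. This is a sketch only: the function name is hypothetical and only the first five years of the index from Table 1 are entered.

```python
# First years of the wholesale price index from Table 1 (1890-1894).
index = {1890: 56.2, 1891: 55.8, 1892: 52.2, 1893: 53.4, 1894: 47.9}


def centered_average(series, year, length):
    """Moving average of the given length centred on `year`.

    For an even length the two extreme values enter with weight one-half,
    so that the average stays centred on a whole year.
    """
    half = length // 2
    if length % 2 == 1:
        terms = [series[year + i] for i in range(-half, half + 1)]
        return sum(terms) / length
    inner = [series[year + i] for i in range(-half + 1, half)]
    ends = 0.5 * (series[year - half] + series[year + half])
    return (sum(inner) + ends) / length


print(round(centered_average(index, 1891, 3), 1))
print(round(centered_average(index, 1892, 2), 1))
```

The same routine, applied year by year with the lengths listed in Table 1, gives the smoothed series.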

It should be noted that we lose one item in the beginning of the smoothed
series and five items at the end. The loss of six items may be of some 
importance with such a short series. The loss of the terms near the end 
of the series is particularly deplorable. 

It is not easy to locate the actual maxima and minima accurately. It 
is quite possible that we may have made mistakes here. If this is the 
case, then the resulting smoothed series does not represent very well the 
trend and longer cycle. 

The application of the method of moving averages will be illustrated in 
connection with the analysis of the seasonal movement in section 9.3, 
Example 2. An illustration of the fitting of moving averages with non¬ 
constant weights is given in section 11.2.4, Example 1.

8.2.2 SUCCESSIVE SMOOTHING BY MOVING AVERAGES 

If we first eliminate the season and then the cycle by this method we 
use successively two moving averages. 6 


6 G. Tintner: Prices in the Trade Cycle (Vienna, 1935), pp. 83 ff.





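Let the moving average of length 2n + 1 have the weights
$w'_{-n}, \ldots, w'_0, \ldots, w'_n$. If the moving average has
constant weights we have $w'_i = 1/(2n + 1)$.
We apply it to the time series: $X_1, X_2, \ldots, X_N$. The result is:

(1) $X'_t = \sum_{i=-n}^{n} w'_i X_{t+i}$ $(t = n + 1, n + 2, \ldots, N - n)$

It should be noted that we lose n items at the beginning and the end
of the smoothed series $X'_t$.

Now we smooth the series $X'_t$ by a moving average of length 2m + 1
with weights: $w''_{-m}, \ldots, w''_0, \ldots, w''_m$. The
resulting series of smoothed values is:

(2) $X''_t = \sum_{i=-m}^{m} w''_i X'_{t+i}$ $(t = m + n + 1, m + n + 2, \ldots, N - m - n)$

The same result might have been obtained by a single moving average
of length 2(m + n) + 1 with the weights
$w'''_{-m-n}, \ldots, w'''_{m+n}$. Applying this
to our original series $X_t$, we have:

(3) $X''_t = \sum_{i=-m-n}^{m+n} w'''_i X_{t+i}$ $(t = m + n + 1, m + n + 2, \ldots, N - m - n)$

The weights $w'''_i$ are formed in the following way:

(4) $w'''_i = \sum_{j=0}^{m+n-i} w'_{n-j} w''_{i-n+j}$

where weights whose subscripts fall outside their range are taken as zero.
Similar formulae hold for the successive application of any
number of moving averages.

For example, a three-term moving average with constant weights followed
by a five-term moving average with constant weights is equivalent to a
seven-term average with the weights:

$w'''_0 = 3/15$, $w'''_1 = w'''_{-1} = 3/15$, $w'''_2 = w'''_{-2} = 2/15$, $w'''_3 = w'''_{-3} = 1/15$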
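That two successive moving averages act like one longer moving average can be verified directly. The sketch below combines the two sets of weights by summing products of weights whose subscripts add up to the same total, and checks the result on a short series. The series is made up for illustration, and the three-term and five-term constant weights are chosen here as an assumption.

```python
from fractions import Fraction


def smooth(series, weights):
    """Apply a moving average; the result is shorter by len(weights) - 1."""
    k = len(weights)
    return [sum(w * series[s + i] for i, w in enumerate(weights))
            for s in range(len(series) - k + 1)]


def combine(first, second):
    """Weights of the single average equivalent to `first` then `second`."""
    out = [Fraction(0)] * (len(first) + len(second) - 1)
    for i, a in enumerate(first):
        for j, b in enumerate(second):
            out[i + j] += a * b
    return out


three = [Fraction(1, 3)] * 3
five = [Fraction(1, 5)] * 5
seven = combine(three, five)

# Hypothetical data, used only to compare the two routes.
data = [Fraction(v) for v in (3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8)]
twice = smooth(smooth(data, three), five)
once = smooth(data, seven)

print([str(w) for w in seven])
print(twice == once)
```

The combined weights sum to 1, and the series smoothed twice has lost n + m = 3 items at each end, as the ranges of t in the formulae require.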


8.2.3 MOVING AVERAGES AND THE RANDOM ELEMENT 7

Let us assume that we have two stochastic series $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ and
$\eta_1, \eta_2, \ldots, \eta_N$. Their means are zero:

(1) $E\epsilon_t = E\eta_t = 0$ $(t = 1, 2, \ldots, N)$

7 G. Tintner: Prices in the Trade Cycle (Vienna, 1935), pp. 84 ff. M. G.
Kendall: op. cit.


The autocovariances are given by:

(2) $E\epsilon_t\epsilon_{t+s} = E\epsilon_t\epsilon_{t-s} = A_s$ $(s = 1, 2, \ldots)$

(3) $E\eta_t\eta_{t+s} = E\eta_t\eta_{t-s} = B_s$ $(s = 1, 2, \ldots)$

The autocovariance is the covariance of the series with itself, lagged by a
certain time interval s.

The serial covariance of $\epsilon_t$ and $\eta_{t+s}$ is given by the formula:

(4) $E\epsilon_t\eta_{t+s} = C_s$ $(s = 1, 2, \ldots)$

The serial covariance of two series is the covariance of one with the other
lagged by the time interval s.

Then we form the smoothed series $\epsilon'_t$ and $\eta'_t$ by applying a moving
average of length 2n + 1 to $\epsilon_t$ and another of length 2m + 1 to $\eta_t$:

(5) $\epsilon'_t = \sum_{i=-n}^{n} w'_i \epsilon_{t+i}$ $(t = n + 1, n + 2, \ldots, N - n)$

(6) $\eta'_t = \sum_{i=-m}^{m} w''_i \eta_{t+i}$ $(t = m + 1, m + 2, \ldots, N - m)$

The autocovariance of the smoothed series $\epsilon'_t$ and $\epsilon'_{t+s}$ is now:

(7) $E\epsilon'_t\epsilon'_{t+s} = \sum_{i=-n}^{n} \sum_{j=-n}^{n} w'_i w'_j A_{s+j-i}$

The autocovariance between $\eta'_t$ and $\eta'_{t+s}$ becomes:

(8) $E\eta'_t\eta'_{t+s} = \sum_{i=-m}^{m} \sum_{j=-m}^{m} w''_i w''_j B_{s+j-i}$

Finally the serial covariance between items of the two smoothed series
$\epsilon'_t$ and $\eta'_{t+s}$ is:

(9) $E\epsilon'_t\eta'_{t+s} = \sum_{i=-n}^{n} \sum_{j=-m}^{m} w'_i w''_j C_{s+j-i}$

Now let $\epsilon_t$ be not autocorrelated, i.e., a random series. Then we have
$A_s = 0$, for $s \neq 0$. The variance of $\epsilon_t$ is $A_0$. We have from (7) for
this case:

(10) $E\epsilon'^2_t = A_0 \sum_{i=-n}^{n} w'^2_i$

This is the variance in the smoothed series $\epsilon'_t$. The autocovariance is
given by:

(11) $E\epsilon'_t\epsilon'_{t+s} = A_0 \sum_{j=-n}^{n-s} w'_j w'_{j+s}$

This follows from formula (7). This result shows that the moving average
introduces autocorrelations into the smoothed series $\epsilon'_t$ for all s < 2n + 1
even if the original series $\epsilon_t$ was not autocorrelated (section 10.6.1).

The autocorrelation coefficient is:

(12) $r_s = \frac{E\epsilon'_t\epsilon'_{t+s}}{E\epsilon'^2_t} = \frac{\sum_{j=-n}^{n-s} w'_j w'_{j+s}}{\sum_{j=-n}^{n} w'^2_j}$

It is again apparent that $r_s = 0$ for s > 2n + 1. Hence the correlogram,
which is the graph of the autocorrelation coefficients, must be zero for
a lag or lead of s > 2n + 1.

This need, however, not be the case with an empirical correlogram 
from an economic time series. Bartlett 8 has shown that the autocorrelation
coefficients $r_s$ are themselves highly autocorrelated. Hence the
appearance of an empirical correlogram may depend more upon the 
actual sample than upon the underlying stochastic scheme. 

Next define $\epsilon_t$ as before as a random series. Let also $\eta_t$ be another
random series so that $B_s = 0$ for $s \neq 0$. Similarly, we have no serial
covariance for $s \neq 0$: $C_s = 0$, if $s \neq 0$. But the covariance between
contemporaneous values of $\epsilon_t$ and $\eta_t$ is $C_0$. Then, if n < m, we have for
the covariance of $\epsilon'_t$ and $\eta'_t$ the following relationship from formula (9):

(13) $E\epsilon'_t\eta'_t = C_0 \sum_{i=-n}^{n} w'_i w''_i$

From the same formula we have for the serial covariance of $\epsilon'_t$ and
$\eta'_{t+s}$:

(14) $E\epsilon'_t\eta'_{t+s} = C_0 \sum_{i=-n}^{n} w'_i w''_{i-s}$

where weights $w''$ whose subscripts fall outside the range from -m to m
count as zero.

The fact that the application of moving averages introduces autocorrelations
and serial correlations into pure random series is not surprising.
Indeed, it has been demonstrated that the following fact is true:
If a random series is repeatedly summed, then the resulting series tends
in the limit to a pure periodic function with a period equal to the length
of the original series. 9


8 M. S. Bartlett: “On the theoretical specification of sampling properties
of autocorrelated time series,” Journal of the Royal Statistical Society, vol. 8
(1946), supplement, pp. 128 ff.

9 E. Slutzky: “The summation of random causes as a source of cyclical
processes,” Econometrica, vol. 5 (1937), pp. 105 ff. A. Wald: “Long cycles
as a result of repeated integration,” American Mathematical Monthly, vol. 46
(1939), pp. 136 ff. See also H. T. Davis: Analysis of Economic Time Series
(Bloomington, Ind., 1941), pp. 159 ff.





The ideas presented in this section are of some importance. They 
show us what happens to the random element of a time series after it has 
been smoothed by moving averages. 

Some of the formulae given will be used in section 10.6.1, where we 
deal with specific stochastic schemes of the type of moving averages. 

Example 1. Suppose a random series $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$ is not autocorrelated
and has mean zero and variance $\sigma^2$. Then it is smoothed with a
five-term moving average with the following weights: $w_{-2}$ = - 0.086,
$w_{-1}$ = 0.343, $w_0$ = 0.486, $w_1$ = 0.343, $w_2$ = - 0.086. The sum of the
weights is 1. This is the moving average which corresponds to the fitting
of a second-degree parabola to five consecutive points of observation. 10
The smoothed series is:

(15) $\epsilon'_t = - 0.086\epsilon_{t-2} + 0.343\epsilon_{t-1} + 0.486\epsilon_t + 0.343\epsilon_{t+1} - 0.086\epsilon_{t+2}$
$(t = 3, 4, \ldots, N - 2)$

The variance of the series $\epsilon'_t$ is evidently [formula (7)]:

(16) $E\epsilon'^2_t = \sigma^2[(- 0.086)^2 + (0.343)^2 + (0.486)^2 + (0.343)^2 + (- 0.086)^2] = 0.486\sigma^2$

The autocovariances of the smoothed series are [also from formula (7)]:

(17) $E\epsilon'_t\epsilon'_{t+1} = \sigma^2[(- 0.086)(0.343) + (0.343)(0.486) + (0.486)(0.343) + (0.343)(- 0.086)] = 0.274\sigma^2$

(18) $E\epsilon'_t\epsilon'_{t+2} = \sigma^2[(- 0.086)(0.486) + (0.343)^2 + (0.486)(- 0.086)] = 0.034\sigma^2$

(19) $E\epsilon'_t\epsilon'_{t+3} = \sigma^2[(- 0.086)(0.343) + (0.343)(- 0.086)] = - 0.058\sigma^2$

(20) $E\epsilon'_t\epsilon'_{t+4} = \sigma^2[(- 0.086)(- 0.086)] = 0.007\sigma^2$

Members of the smoothed series $\epsilon'_t$ which are distant by five or more
units are not correlated.
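These autocovariances are sums of products of weights that lie a fixed number of places apart, so they are quickly checked. The following sketch uses a hypothetical function name and expresses every result in units of the variance of the original series.

```python
# Five-term weights w_{-2}, ..., w_2 from the example.
weights = [-0.086, 0.343, 0.486, 0.343, -0.086]


def smoothed_autocovariance(w, lag):
    """Autocovariance of the smoothed series at `lag`,
    in units of the variance of the original random series."""
    if lag >= len(w):
        return 0.0
    return sum(w[j] * w[j + lag] for j in range(len(w) - lag))


for s in range(6):
    print(s, round(smoothed_autocovariance(weights, s), 3))
```

Rounded to three decimals the lag 3 value comes out as - 0.059, whereas the text shows - 0.058; the other values agree with the figures given above, and every lag of five or more gives zero.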

Let further $\eta_t$ be a not autocorrelated random series with mean zero
and variance $\tau^2$. It is smoothed by a simple three-term moving average
with constant weight 1/3:

(21) $\eta'_t = 0.333\eta_{t-1} + 0.333\eta_t + 0.333\eta_{t+1}$ $(t = 2, 3, \ldots, N - 1)$

Assume the covariance between the original series $\epsilon_t$ and $\eta_t$ to be $\gamma$:

(22) $E\epsilon_t\eta_t = \gamma$
10 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), 
pp. 108 ff. 




There is, however, no serial correlation between the two random series.
There is independence between $\epsilon_t$ and $\eta_s$ if $t \neq s$:

(23) $E\epsilon_t\eta_s = 0$ $(t \neq s)$

We have now for the covariance of the smoothed series [from formula
(13)]:

(24) $E\epsilon'_t\eta'_t = [(0.343)(0.333) + (0.486)(0.333) + (0.343)(0.333)]\gamma = 0.390\gamma$

On the other hand, we obtain for the serial covariances of the two
series from formula (14):

(25) $E\epsilon'_t\eta'_{t+1} = [(0.486)(0.333) + (0.343)(0.333) + (- 0.086)(0.333)]\gamma = 0.247\gamma$

This is the serial covariance for lag 1. We have for lag 2:

(26) $E\epsilon'_t\eta'_{t+2} = [(0.343)(0.333) + (- 0.086)(0.333)]\gamma = 0.086\gamma$

Finally, the serial covariance for lag 3 is:

(27) $E\epsilon'_t\eta'_{t+3} = (- 0.086)(0.333)\gamma = - 0.029\gamma$

Items of the smoothed series which are distant by L = 4 or more items
are independent.

This example shows that the smoothing of non-autocorrelated time
series introduces autocorrelations into the smoothed series. Even if there
was no serial correlation between the original series the smoothed series
will be serially correlated.
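The serial covariances of the two smoothed series can be checked in the same way. In the sketch below, which uses hypothetical names, the weights are stored by subscript, a missing subscript counts as a zero weight, and all results are in units of the covariance of the two original series.

```python
# Five-term weights applied to the first series, by subscript.
w_first = {-2: -0.086, -1: 0.343, 0: 0.486, 1: 0.343, 2: -0.086}
# Three-term constant weights applied to the second series.
w_second = {-1: 0.333, 0: 0.333, 1: 0.333}


def serial_covariance(w1, w2, lag):
    """Covariance between the first smoothed series at time t and the
    second smoothed series at time t + lag, in units of the
    contemporaneous covariance of the original series."""
    return sum(a * w2.get(i - lag, 0.0) for i, a in w1.items())


for s in range(5):
    print(s, round(serial_covariance(w_first, w_second, s), 3))
```

Lags 0 to 3 reproduce the four figures of the example, and lag 4 gives zero because the two sets of weights no longer overlap.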

8.2.4 THE EFFECT OF MOVING AVERAGES ON THE AMPLITUDES 
OF PERIODIC MOVEMENTS 11 

What is the effect of the application of a moving average with constant
weights of length 2n + 1 on the amplitude of a cyclical movement of
period P? This is a problem of some importance in economic statistics.
We may, for instance, use moving averages to remove the seasonal (section
9.3). What is their effect on the amplitude of the cycle?

A well-known formula for the sum of sines and cosines 12 shows that
the amplitude of the cyclical movement of period P is diminished in the
proportion:

(1) $\frac{\sin[(2n + 1)180°/P]}{(2n + 1)\sin(180°/P)}$

11 G. Tintner: Prices in the Trade Cycle (Vienna, 1935), p. 85. M. G.
Kendall: op. cit., p. 380.

12 C. Jordan: Calculus of Finite Differences (2nd ed., New York, 1947), p. 104.


This formula assumes of course constant amplitudes and periods for 
both cyclical movements, e.g., the season and the business cycle. They 
must in fact be pure sinusoidal movements. This assumption will only 
rarely be fulfilled with economic series. Hence the formula for the 
reduction of the amplitude by the application of moving averages will 
give only very rough approximations to the reduction of the “true”
amplitude.

Example 1. Let us assume that the true period of the cyclical movement
is 30 time units. If we use a moving average of extent 2n + 1 = 3,
then we will diminish the amplitude of the cycle in the proportion:

(2) $\frac{\sin 18°}{3 \sin 6°} = 0.9854$

This is to say, the series smoothed by the three-term moving average will
have an amplitude diminished by about 1.5 per cent.

This figure has to be considered only a pretty rough approximation if 
we deal with empirical time series. They are not likely to have constant 
periods and amplitudes in their cyclical movements. 
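For a pure sinusoid the reduction factor can be confirmed by smoothing a cosine wave directly. The sketch below uses hypothetical function names and works in radians, so that 180 degrees becomes pi; it assumes, as the formula does, constant weights and an odd length.

```python
import math


def amplitude_factor(length, period):
    """Factor by which a constant-weight moving average of odd `length`
    scales a sinusoid of the given period."""
    return (math.sin(length * math.pi / period)
            / (length * math.sin(math.pi / period)))


def empirical_factor(length, period):
    """Smooth cos(2*pi*t/period) at t = 0, where the unsmoothed value is 1."""
    half = length // 2
    return sum(math.cos(2 * math.pi * i / period)
               for i in range(-half, half + 1)) / length


print(round(amplitude_factor(3, 30), 4))
print(round(empirical_factor(3, 30), 4))
print(round(amplitude_factor(13, 12), 4))
```

The first two lines agree, which is the content of the formula. The third line takes a length of 13 against a period of 12 purely as an illustration of how a length close to the period removes most of the amplitude.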


8.3 Fitting the Logistic 

It has been shown in many biological investigations that animal populations
follow under certain conditions the logistic law of growth:

(1) $Y(t) = \frac{k}{1 + be^{-at}}$ $(t = 1, 2, \ldots, N)$

where a, b, k are constants and t is time. The property which recommends
this law as compared with polynomial or exponential trends is the fact
that it has an upper asymptote: This asymptote is k. It is likely that
human populations also may sometimes follow approximately this law
or one of its modifications. 2 Hence we may suspect that many economic
time series which are closely related to the development of population
may have logistic trends more or less of the form (1). 3

1 R. Pearl: Studies in Human Biology (Baltimore, 1924), Chapter 24. A. J.
Lotka: Elements of Physical Biology (Baltimore, 1925), Chapter 7. V. Volterra:
Leçons sur la théorie mathématique de la lutte pour la vie (Paris, 1931).
Bertalanffy: Allgemeine Biologie (Berlin, 1942), pp. 327 ff.

The constants in equation (1) enter in a non-linear fashion. Hence the 
application of the method of least squares is difficult (section 5.1). As 
Hotelling 4 has shown, however, this difficulty can be overcome by con¬ 
sidering the differential equation of (1). This idea is suggestive of the 
methods of stochastic differential equations which will be discussed in 
section 10.4. 

Differentiating (1) with respect to t, we obtain:

(2) $\frac{dY}{dt} \frac{1}{Y} = a - \frac{a}{k} Y$

If the interval is not too large the quantity (dY/dt)(1/Y) may be
approximated by:

(3) $R(t) = \frac{\Delta Y}{Y}$

where $\Delta Y$ is the first difference of Y, i.e., $Y_{t+1} - Y_t$. Better
approximations which involve higher differences are also available. 5

Suppose that we have a set of observations $Y_1, Y_2, \ldots, Y_N$ for the
equidistant points in time 1, 2, ..., N. Then we want to fit by the
classical method of least squares the relationship:

(4) $R_t = p + qY_t + \epsilon_t$ $(t = 1, 2, \ldots, N)$

where the $\epsilon_t$ are errors or deviations. They must have zero mean and
constant variance and cannot be autocorrelated.

The least squares estimates p and q may be used for estimating the
parameters a and k:

(5) $p = a$

(6) $k = -\frac{p}{q}$

The constant k is the upper asymptote of the trend. It is the highest
value the logistic can assume as t becomes infinite.

2 H. T. Davis: Analysis of Economic Time Series (Bloomington, Ind., 1941),
pp. 17 ff., pp. 247 ff. E. C. Rhodes: “Population mathematics III,” Journal of
the Royal Statistical Society, vol. 103 (1940), pp. 362 ff.

3 S. S. Kuznets: Secular Movements of Production and Prices (Boston, 1930).

4 H. Hotelling: “Differential equations subject to error and population
estimates,” Journal of the American Statistical Association, vol. 22 (1927), pp. 283 ff.

5 E. T. Whittaker and G. Robinson: The Calculus of Observations (London,
1924), pp. 62 ff.

In order to estimate b we make the assumption that the best fit goes
through the arithmetic means of Y and t. Here t is counted from 1 to N.

We have, according to Rhodes:

(7) $\log_e \left(\frac{k}{Y_t} - 1\right) = \log_e b - at$

Hence b can be computed from:

(8) $\log_e b = a\bar{t} + \log_e \left(\frac{k}{\bar{Y}} - 1\right)$

A different method for determining the constant b is given by H. T. 
Davis. 6 
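The steps of the Hotelling method can be put together in a few lines. What follows is a sketch under stated assumptions, not the computing scheme of Hotelling, Rhodes or Davis: the function names are hypothetical, the data are synthetic and free of error, and the constant b is obtained by evaluating relation (7) at the means of Y and t, which is one reading of the rule that the fit passes through the means.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = p + q*x; returns (p, q)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    q = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    return my - q * mx, q


def fit_logistic(Y):
    """Estimate (a, b, k) of Y = k / (1 + b*exp(-a*t)), t = 1, ..., N."""
    # Relative first differences; the last observation has no successor.
    R = [(Y[i + 1] - Y[i]) / Y[i] for i in range(len(Y) - 1)]
    p, q = fit_line(Y[:-1], R)          # regression of R_t on Y_t
    a, k = p, -p / q                    # a = p, k = -p/q
    t_bar = (len(Y) + 1) / 2            # mean of 1, ..., N
    y_bar = sum(Y) / len(Y)
    log_b = a * t_bar + math.log(k / y_bar - 1)
    return a, math.exp(log_b), k


# Synthetic data with hypothetical parameters, for illustration only.
true_a, true_b, true_k = 0.05, 20.0, 100.0
Y = [true_k / (1 + true_b * math.exp(-true_a * t)) for t in range(1, 121)]
a_hat, b_hat, k_hat = fit_logistic(Y)
print(round(a_hat, 4), round(b_hat, 2), round(k_hat, 2))
```

Even with error-free data the estimates differ slightly from the true constants, because the first difference only approximates the derivative. With real series the errors in the relative differences add to this.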

The fitting of the logistic is much more complicated computationally 
than the fitting of polynomial trends (section 8.1). Nevertheless this 
type of trend is much to be preferred. The main reason is the existence 
of the upper asymptote. There is, at least in some cases, good reason 
to believe that the “true” trend of economic series is a logistic, if the
series is connected with population. This is certainly the case for many 
empirical economic series. 

It is, however, possible that the “true” trends of some series are more
complicated than the logistic. 

Example 1. H. T. Davis gives 7 the following example of fitting a
logistic by the Hotelling method: The data are the annual figures of the
Standard Statistics Index of Industrial Production for the 54 years
1884-1937.

The least squares estimates of p and q are p = 0.1563555 and
q = −0.00202886. Hence we have immediately from (5) and (6): a = p =
0.1563555 and k = −p/q = 77.06564. The last figure is the upper
asymptote of the trend.

We estimate b from formula (8). The mean of Y is $\bar{Y}$ = 52.80555.
Hence we have from this formula:


6 H. T. Davis: op. cit., p. 251.

7 H. T. Davis: op. cit., pp. 252 ff.




(9)   $\log_e b = 3.670496$

The complete equation of the logistic is:

(10)   $Y = \dfrac{77.06564}{1 + 39.27\,e^{-0.1563555\,t}}$


The equation represents the trend of industrial production. Its validity 
is somewhat doubtful because of the low value of the upper asymptote. 

Example 2. We want to fit a logistic trend to the yearly series of the
United States wholesale price index, 1890-1947. There are N = 58 yearly
items. The mean of the series is $\bar{Y}$ = 80.286.

We construct the series $R_t$ by dividing the first differences of $Y_t$ by
the values of $Y_t$. Then we have from formula (4):

(11)   $R_t = 0.040852 - 0.000226117\,Y_t$

This is the result of an application of the classical least squares
analysis. We have from (5):

(12)   $a = 0.040852$

and from (6):

(13)   $k = \dfrac{-0.040852}{-0.000226117} = 180.669226$

This is the upper asymptote of the trend of the wholesale price index.
We have also from (8):

(14)   $\log_e b = 1.429539$

The complete logistic trend is finally:

(15)   $Y = \dfrac{180.669226}{1 + 4.1768\,e^{-0.040852\,t}}$


Again, we may have doubts whether this formula represents the trend 
of wholesale prices adequately. 

8.4 A Non-parametric Test for the Trend 

In some cases of economic phenomena closely related to population 
we may assume that the trend will be a logistic curve. But in most 
problems we have no particular theoretical reason for preferring one 
form of the trend to the other. The trend may be an exponential, a
logistic, a polynomial of any degree, etc. Hence it is important to have 





a non-parametric 1 test for the trend. There is now no assumption of a
normal distribution. This test does not assume anything about the
a priori form of the trend (logistic, straight line, parabola, cubic,
exponential, or any other form) but is simply based upon all possible
combinations of the actually observed values. It is a test for randomness. 
It reduces these values to ranks. Evidently the ranked values of any 
monotonically increasing or decreasing function are the same. The test 
neglects some of the available information. Hence it may not always 
have maximum efficiency. 

This test for trend has been suggested by H. B. Mann. 2 It is identical 
with the theory of a rank correlation coefficient introduced by M. G. 
Kendall, 3 whose exposition we propose to follow. 

Consider the time series given in the form: $X_1, X_2, \ldots, X_N$, where
$X_1$ is the observed value in the first time unit (year, month), $X_2$ the
observed value in the second unit, ..., $X_N$ the observed value in the last
unit. Now we replace the values of the observed series by their ranks: We
order the series $X_1, X_2, \ldots, X_N$ according to magnitude, so that we
start with the smallest and end with the largest value. In terms of ranks
our series becomes now: $p_1, p_2, \ldots, p_N$, where $p_1$ is the rank of
$X_1$, $p_2$ the rank of $X_2$, ..., $p_N$ the rank of $X_N$.

We want to compute a coefficient of disarray. 4 This coefficient should
lie between −1 and 1. The coefficient is 1 for perfect agreement, i.e.,
if the series of ranks $p_1, p_2, \ldots, p_N$ is the series of positive
integers 1, 2, ..., N. It is −1 for perfect disagreement, i.e., the series
N, N − 1, N − 2, ..., 2, 1, the series of positive integers in reverse
order. In general it is defined as:

(1)   $\tau = \dfrac{2S}{N(N-1)}$

where S is called the total score. It is most conveniently computed from
the number of positive scores P:

(2)   $S = 2P - \tfrac{1}{2}N(N-1)$


1 H. Scheffé: “Statistical inference in the non-parametric case,” Annals of
Mathematical Statistics, vol. 14 (1943), pp. 305 ff.

2 H. B. Mann: “Non-parametric tests against trends,” Econometrica, vol.
13 (1945), pp. 246 ff.

3 M. G. Kendall: Rank Correlation Methods (London, 1948); see also “A
new measure of rank correlation,” Biometrika, vol. 30 (1938), pp. 81 ff. H. E.
Daniels: “The relation between measures of correlation in the universe of
sample permutations,” ibid., vol. 33 (1944), pp. 129 ff.

4 M. G. Kendall: Rank Correlation Methods (London, 1948), pp. 3 ff.





The positive scores P are computed in the following way: Consider
first the rank of the first element, $p_1$. Call $n_1$ the number of ranks
of the elements $X_2, X_3, \ldots, X_N$ which are larger than $p_1$. Then
take the rank of the second element, $p_2$. Call $n_2$ the number of ranks
of the items $X_3, X_4, \ldots, X_N$ which are larger than $p_2$, etc. In
general for each item with rank $p_i$ we call $n_i$ the number of elements
$X_{i+1}, X_{i+2}, \ldots, X_N$ which have higher rank than $p_i$. Then P is
simply:

(3)   $P = n_1 + n_2 + \cdots + n_{N-1}$


To test the significance of an observed S we may use the tables provided
by Kendall 5 for N = 4 to N = 10. For larger N we note that the
distribution of S converges rapidly to normality. 6 Hence for N larger than
10 the quantity S is distributed approximately normally with mean zero
and variance:

(4)   $\dfrac{N(N-1)(2N+5)}{18}$

We have, however, to make a correction for continuity. For a positive
S we subtract 1, and for a negative S we have to add 1, in order to be
able to use the continuous normal distribution. The score S is evidently
discontinuous. A theory of tied ranks has been given by Kendall. 7

Example 1. We will first investigate a short series for trend. We
take the Dow-Jones industrial stock prices, yearly averages 1906-13, which
are given in H. T. Davis' book. 8 The ranks are also given in Table 1.


TABLE 1

Year    Industrial Stock Price Average    Rank    Positive Score
 t                  X_t                   p_t
1906               93.88                    8            0
1907               74.91                    1            6
1908               75.55                    2            5
1909               92.77                    7            0
1910               84.27                    5            1
1911               82.37                    4            1
1912               88.71                    6            0
1913               79.20                    3

5 Ibid., p. 141.

6 Ibid., pp. 38 ff.

7 Ibid., pp. 25 ff.

8 H. T. Davis: Analysis of Economic Time Series (Bloomington, Ind., 1941).





It is evident that the smallest value is 74.91; hence it gets the rank 1. 
The next largest value is 75.55, which receives the rank 2, etc. The 
largest observation, 93.88, receives the rank 8. 

We proceed now to compute the total number of positive scores P.
We have for the first value the rank 8; it receives a positive score
$n_1 = 0$, since there is no item with larger rank below it. The next item
has rank 1. It receives the score $n_2 = 6$, since there are six
observations with larger ranks below it, namely 2, 7, 5, 4, 6, 3. The third
item has rank 2. It receives the positive score $n_3 = 5$, since there are
five observations with larger ranks below, namely 7, 5, 4, 6, 3. The next
item has rank 7. It receives the score zero, since there is no item with
larger rank below. The fifth observation with rank 5 receives the score 1,
since there is just one item with larger rank below, namely 6. The next
item has rank 4 and receives the score 1, since there is one item with
larger rank below, namely 6. The seventh observation finally has rank 6 and
has the score zero, since there is no item with larger rank below. We have
hence:

(5)   $P = 0 + 6 + 5 + 0 + 1 + 1 + 0 = 13$

From formula (2) we compute the total score:

(6)   $S = (2)(13) - (\tfrac{1}{2})(8)(7) = -2$

From formula (1) we obtain the rank correlation coefficient which
measures the extent of disarray:

(7)   $\tau = \dfrac{(2)(-2)}{(8)(7)} = -0.071$


This is a small negative value, indicating the possibility of a slight 
negative trend in the series of stock prices. 

In order to test the significance of S we enter the tables provided by
M. G. Kendall for N = 8, S = 2 (the S-distribution is symmetric; hence
a probability of positive S is the same as a probability for negative S).
The probability of obtaining S = −2, or smaller, is 0.452. This is a
value which cannot be considered significant from the point of view of a
5 per cent significance level. Hence we may conclude: It is very probable
that the series of eight stock prices shows no trend.
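
The computation of P, S, and the rank correlation coefficient for the eight stock price averages can be retraced with a short Python sketch. The function names are hypothetical, and the sketch assumes that there are no tied values, so that comparing the observations directly is equivalent to comparing their ranks.

```python
def positive_score(x):
    """Formula (3): for each item count the later items that are larger."""
    n = len(x)
    return sum(1 for i in range(n) for j in range(i + 1, n) if x[j] > x[i])

def kendall_trend(x):
    """Return P, the total score S of formula (2), and tau of formula (1)."""
    n = len(x)
    p = positive_score(x)
    s = 2 * p - n * (n - 1) // 2     # n(n-1) is always even
    tau = 2 * s / (n * (n - 1))
    return p, s, tau

# Dow-Jones industrial stock price averages, 1906-13 (Table 1).
dow = [93.88, 74.91, 75.55, 92.77, 84.27, 82.37, 88.71, 79.20]
P, S, tau = kendall_trend(dow)
print(P, S, round(tau, 3))
```

The double loop needs N(N − 1)/2 comparisons, which is of no consequence for series as short as those treated here.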

Example 2. We want to make a non-parametric test for the trend of 
the 10 years of the United States wholesale price index , 1938-47. The 
data are given in Table 2. 

It will be noted that the items are almost in perfect order. This is of
course due to the war and postwar inflation. The only exception is the
index for 1939.




TABLE 2

Wholesale Price Index

Year     Index     Rank    Positive Score
1938      78.6       2           8
1939      77.1       1           8
1940      78.7       3           7
1941      87.3       4           6
1942      98.8       5           5
1943     103.1       6           4
1944     104.2       7           3
1945     105.8       8           2
1946     121.1       9           1
1947     151.8      10

Sum                           P = 44


We have the sum of positive scores P = 44. Hence the total score
S = (2)(44) − (½)(10)(9) = 43 [formula (2)].

The rank correlation coefficient is τ = (2)(43)/(10)(9) = 0.956 [formula
(1)]. This indicates a high positive correlation.

Indeed, for N = 10, the tables provided by Kendall give for S = 43
a probability of only 0.0000028. Hence the result differs significantly
from what we would expect from a pure random series. It is almost
certain that there is a trend in the index series.

Suppose that we want to use the large sample test. Then the total number of
scores S is approximately normally distributed with mean zero and
variance (10)(9)(25)/18 = 125 [formula (4)]. From the normal
distribution we see that the corresponding probability is less than 0.0001.
This confirms our conclusion that there is in all probability
a trend in the data.
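
The large sample test can be retraced as follows. This is a minimal sketch: the function name is hypothetical, the probability is the one-sided tail of the normal distribution, and the continuity correction is the one described before formula (4).

```python
import math

def large_sample_trend_test(s, n):
    """Normal approximation for the total score S with continuity correction."""
    variance = n * (n - 1) * (2 * n + 5) / 18      # formula (4)
    if s > 0:
        s_corrected = s - 1
    elif s < 0:
        s_corrected = s + 1
    else:
        s_corrected = 0
    z = s_corrected / math.sqrt(variance)
    # One-sided tail probability of the standard normal distribution.
    p_one_sided = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return variance, z, p_one_sided

variance, z, p_value = large_sample_trend_test(43, 10)
print(variance, round(z, 3), p_value)
```

For the wholesale price index the variance comes out as 125 and the tail probability falls below 0.0001, in agreement with the statement in the text.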



Chapter 9 


Oscillatory and Periodic Movements 


We have at least two types of oscillatory movements in economic time 
series: the seasonal variation with a fixed period and the business cycle
with a variable period and amplitude. There is also the possibility of 
other cycles of longer period, e.g., the so-called Kondratieff waves. 1 

There is an extensive economic literature on business cycles. 2 The
subject is still much disputed, and it is not easy to see how purely
theoretical arguments, together with somewhat casual historical
investigations, can settle the problem. Econometric methods may possibly be
more successful. They may eventually throw some light upon the relative
merits of the various business cycle theories which have been proposed by a
number of economists or even suggest new and more promising approaches.
A beginning of this type of investigation has been made in
Tinbergen's 3 books on business cycles.

In his monumental study of business cycles Professor J. A. Schumpeter 4 
distinguishes three major types of oscillations: Kondratieff cycles with 
a period of somewhat less than 60 years, Juglar cycles with a period of 
less than 10 years, and Kitchin cycles with a period of about 40 months. 

Two statistical methods which seem to have some merit in the
investigation of these difficult problems are the following: (a) Fourier
analysis (section 9.1) is fundamental for many of the methods proposed in
this field. It is applicable only if there are known periods of constant
length. (b) Periodogram analysis (section 9.2) seeks to find “hidden”
periodicities. Since there is no constant period with the business cycle,
this method

1 N. D. Kondratieff: “The long waves in economic life,” Review of Economic
Statistics, vol. 17 (1935), pp. 105 ff.

2 G. von Haberler: Prosperity and Depression (Geneva, 1939). J. A.
Schumpeter: Business Cycles (New York, 1939). W. C. Mitchell: Business
Cycles. The Problem and Its Setting (New York, 1927). A. F. Burns and W. C.
Mitchell: Measuring Business Cycles (New York, 1946).

3 J. Tinbergen: Business Cycles in the United States, 1919-32 (Geneva, 1939),
The Dynamics of Business Cycles (Chicago, 1950).

4 J. A. Schumpeter: Business Cycles, vol. 1 (New York, 1939), pp. 161 ff.



may prove important in some applications. But it should be remembered 
that all we can discover is some kind of average period or periods. 

Of the many methods proposed for eliminating seasonal variations 
we choose the one of Wald (section 9.3) because it is simple and also 
because it seems to be based upon very reasonable economic assumptions. 

Oscillatory movements in time series have sometimes a very irregular 
appearance. Hence it is advantageous to have a non-parametric test 
of cycles (section 9.4) which is independent of any specific assumptions 
about the nature of the cycle, its period, its amplitude, etc. Such a test 
is presented in the last section of this chapter. 

All the methods dealt with in this chapter assume that the various
components of economic time series, say trend, business cycle, and season,
are additive and independent of each other. This is probably not true. 5
Hence the results should be interpreted with some caution.

It should also be emphasized that periodogram analysis (section 9.2)
is probably not adequate for the treatment of the business cycle. It
seems more reasonable to assume that the cycle results from a stochastic
difference equation or autoregressive stochastic scheme, than that there
is a hidden periodicity. Hence the methods proposed in the next chapter
will probably be more adequate to deal with this problem. (See especially
section 10.3.) The method of periodogram analysis has, however, the
advantage of greater simplicity. There are also small sample tests of
significance available which are still missing for the more complicated
methods which will be discussed below.

9.1 Fourier Analysis 

If the empirical time series $X_1, X_2, \ldots, X_N$ has a period of length T
we may represent it by a Fourier series. For instance, monthly data are
likely to have a period of 12 months (seasonal), quarterly data probably
have a period of 4 quarters, etc. The strict periodicity of the business
cycle is more doubtful. 1 It is not likely that we have here a fixed period.

The representation of the series $X_1, X_2, \ldots, X_N$ will be: 2

(1)   $Y_t = A_0 + \sum_{j=1}^{T/2} \left( A_j \cos\dfrac{360\,jt}{T} + B_j \sin\dfrac{360\,jt}{T} \right)$


5 G. Tintner: “A ‘simple’ theory of business fluctuations,” Econometrica,
vol. 10 (1942).

1 G. Tintner: Prices in the Trade Cycle (Vienna, 1935).

2 E. T. Whittaker and G. Robinson: The Calculus of Observations (London,
1924). D. Brunt: The Combination of Observations (Cambridge, 1931).
H. T. Davis: The Analysis of Economic Time Series (Bloomington, Ind., 1941).





Another equivalent representation is:

(2)   $Y_t = A_0 + \sum_{j=1}^{T/2} R_j \cos\left(\dfrac{360\,jt}{T} - \alpha_j\right)$

The expression:

(3)   $R_j^2 = A_j^2 + B_j^2$

is the squared amplitude and

(4)   $\alpha_j = \arctan\dfrac{B_j}{A_j}$

is the phase angle.

The terms in the series are orthogonal (section 8.1). The fit is
accomplished by the method of least squares. This assumes that there is a
random component superimposed upon the periodic movement. The
elements of this random component must have mean zero and constant
variance, and they must also be mutually independent. It is somewhat
doubtful whether this last assumption is ever fulfilled with economic
series. The method of least squares is not applicable without modification
to series with autocorrelated errors or deviations. This problem is
discussed in section 10.5.

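We minimize the sum of squares:

(5)   $Q = \sum_{t=1}^{N} (X_t - Y_t)^2$

The constant $A_0$ is given by:

(6)   $A_0 = \dfrac{\sum_{t=1}^{N} X_t}{N}$

This is simply the mean of the whole series.

The other constants are obtained by computing:

(7)   $A_j = \dfrac{2\sum_{t=1}^{N} X_t \cos\dfrac{360\,jt}{T}}{N}, \qquad B_j = \dfrac{2\sum_{t=1}^{N} X_t \sin\dfrac{360\,jt}{T}}{N}$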

In practice it is more convenient to group the data as shown in Table 1
for the investigation of a given period P (mP is N or the nearest multiple
of P below N). 3

3 M. G. Kendall: Contributions to the Study of Oscillatory Time Series
(Cambridge, 1946), p. 4.






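TABLE 1

        X_1             X_2             ...    X_P
        X_{P+1}         X_{P+2}         ...    X_{2P}
        ...             ...             ...    ...
        X_{(m-1)P+1}    X_{(m-1)P+2}    ...    X_{mP}

Sums    U_1             U_2             ...    U_P

The constants $A_P$ and $B_P$ are computed according to the formula:

(8)   $A_P = \dfrac{2\sum_{i=1}^{P} U_i \cos\dfrac{360\,i}{P}}{mP}, \qquad B_P = \dfrac{2\sum_{i=1}^{P} U_i \sin\dfrac{360\,i}{P}}{mP}$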


It should be recalled that any given set of numbers can be represented
by orthogonal functions, e.g., orthogonal polynomials, or by a Fourier
series. Hence not too much importance can be attached to a Fourier
series unless it can be justified on a priori grounds because we know that
there is a strictly periodic movement. Tests of significance will be
discussed in section 9.2.


TABLE 2

Fourier Coefficients

Period                                   Squared Amplitude
  P          A_P           B_P                R_P^2
  5         0.6904       −5.6544             32.4486
  6        40.1310      −23.8158           2177.6869
  7         0.5473        4.3761             19.4501
  8         4.0634       −5.7846             49.9723
  9        −6.4863       12.7110            203.6147
 10        11.7245        3.0638            146.8498
 11        −4.1354       −7.7980             77.9098
 12        25.0916      −70.9743           5666.9430
 13        −3.5037       −7.0514             61.9981
 14        −4.2985        7.7275             78.1916
 15        14.1744       14.6379            415.1840
 16        10.8471       10.0918            219.5032
 17        −5.0269       11.6832            161.7665
 18        −1.7398       18.9801            363.2730
 19        −3.7861       −7.1347             65.2379






Example 1. H. T. Davis 4 gives the Fourier analysis for the monthly
series of freight car loadings, 1919-32. The mean of the series is 878.42,
and its variance 23,870.2441.

We reproduce in Table 2 the Fourier coefficients and squared
amplitudes.

Tests of significance for these results will be given in section 9.2,
Example 1.

Example 2. We consider now the Fourier coefficients for selected
periods of the deviations from the cubic trend of the yearly meat quantity
series. See section 8.1, Example 2. The period covered is 1919-41
(N = 23).

To illustrate the construction of Table 1 we investigate, for instance,
the amplitude of a 5-year cycle (P = 5). Then we have Table 3 from
our data.

TABLE 3

5-Year Cycle

          5.6953      −2.4558     −7.4268     −4.0501      5.5414
          5.6150      −0.0622     −0.7227     −0.7994     −2.6251
         −2.1324      −0.8540      0.7229      2.0284      7.9650
          6.4603     −10.2246      2.5796     −2.4547     −5.1755

Sums     15.6382     −13.5966     −4.8470     −5.2808      5.7078

The last row is the sum of the columns above. We use the tables given
in M. G. Kendall's pamphlet 5 and obtain for $A_5$ and $B_5$ with mP = 20:

(9)   $A_5 = \dfrac{2[(15.6382)(0.3090) + (-13.5966)(-0.8090) + (-4.8470)(-0.8090) + (-5.2808)(0.3090) + (5.7078)(1.0000)]}{20} = 2.383$

(10)   $B_5 = \dfrac{2[(15.6382)(0.9511) + (-13.5966)(0.5878) + (-4.8470)(-0.5878) + (-5.2808)(-0.9511) + (5.7078)(0.0000)]}{20} = 1.475$

Hence we have for the squared amplitude:

(11)   $R_5^2 = (1.475)^2 + (2.383)^2 = 7.854$
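
The arithmetic of this example can be checked with a short Python sketch, which takes the column sums of the grouped table and weights them by the cosines and sines of the multiples of 360/P degrees. The function name `grouped_fourier` is hypothetical; the trigonometric values are computed directly rather than read from Kendall's tables, so the last digits may differ slightly from a hand computation.

```python
import math

def grouped_fourier(column_sums, n_items):
    """Fourier coefficients for period P from the column sums U_1, ..., U_P.

    n_items is mP, the number of observations actually used.
    """
    period = len(column_sums)
    a = 2.0 / n_items * sum(
        u * math.cos(math.radians(360.0 * i / period))
        for i, u in enumerate(column_sums, start=1)
    )
    b = 2.0 / n_items * sum(
        u * math.sin(math.radians(360.0 * i / period))
        for i, u in enumerate(column_sums, start=1)
    )
    return a, b, a * a + b * b

# Column sums of the 5-year grouping of the meat series, mP = 20.
sums = [15.6382, -13.5966, -4.8470, -5.2808, 5.7078]
A5, B5, R5_squared = grouped_fourier(sums, 20)
print(round(A5, 3), round(B5, 3), round(R5_squared, 3))
```

Working from the column sums rather than from the individual observations is possible because the cosine and sine weights repeat exactly after P items.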


4 H. T. Davis: op. cit., pp. 77 ff.

5 M. G. Kendall: op. cit., pp. 69 ff.





The other squared amplitudes given in Table 4 are computed by the 
same method. 


TABLE 4

Fourier Analysis of Quantity of Meat Consumed, 1919-41

                                     Squared      Number of
Period    Fourier Coefficients      Amplitude       Items
  P         A_P        B_P            R_P^2           mP
  2        0.728      0.000           0.530           22
  3        1.077      1.387           3.084           21
  4       −1.832      0.505           3.610           20
  5        2.383      1.475           7.854           20
  6        0.895      1.177           2.186           18
  7        2.911      0.298           8.563           21
  8        2.651      3.328          18.103           16
  9        1.155     −4.283          19.678           18
 10       −2.440     −0.578           6.288           20
 11       −0.949      0.761           1.480           22

TABLE 5

Fourier Analysis, All Commodity Wholesale Price Index

                                     Squared      Number of
Period    Fourier Coefficients      Amplitude    Observations
  P         A_P        B_P            R_P^2           mP
  2       −1.448      0.000           2.097           52
  3        1.922      0.998           4.690           51
  4        0.373     −0.462           0.353           52
  5        0.522     −0.846           0.988           50
  6        1.367     −2.295           7.136           48
  7        1.291      0.266           1.737           49
  8       −0.926     −2.472           6.968           48
  9        0.762      2.616           7.424           45
 10        0.523     −1.944           4.053           50
 11        1.340      0.087           1.803           44
 12        0.316     −0.130           0.117           48
 13        0.104     −0.021           0.011           52
 14        0.477      0.499           0.476           42
 15        0.321      1.216           1.582           45
 16        0.776      0.197           0.641           48
 17        0.872      0.578           1.094           51
 18        0.205      0.834           0.737           54
 19        0.091      0.695           0.491           38
 20        0.680      0.732           0.998           40





Tests of significance of these findings are given in section 9.2, 
Example 2. 

Example 3. We analyze now the N = 52 items of the deviations of 
the all commodity wholesale price index from its trend, 1891-1942. The 
trend is a moving average with variable length. This procedure has 
been described in section 8.2.1, Example 1.

Table 5 gives the Fourier analysis of the data. 

Tests of significance for these results will be given in section 9.2, 
Example 3. 


9.2 Periodogram Analysis 

It has been pointed out before that the business cycle is a movement
which is approximately periodic with no fixed period. Hence the
technique of harmonic analysis is particularly useful in econometric work.
These methods may, however, also be used for an investigation of the
seasonal variation. An alternative and related method is the process of
stochastic linear autoregression, which will be discussed in section 10.6.2.

It is the purpose of harmonic analysis to test the significance of the 
amplitudes derived in the Fourier analysis. 1 We are searching for “hidden 
periodicities.” 

We compute the amplitudes according to the formulae:

(1)   $A_n = \dfrac{2}{N}\sum_{t=1}^{N} X_t \cos\dfrac{360\,t}{n}$

(2)   $B_n = \dfrac{2}{N}\sum_{t=1}^{N} X_t \sin\dfrac{360\,t}{n}$

or by the method indicated in section 9.1.
The squared amplitude is:

(3)   $R_n^2 = A_n^2 + B_n^2$

If there are no periodic fluctuations we have: 2

(4)   $R_M^2 = \dfrac{4\sigma^2}{N}$

where $R_M^2$ is the mean squared amplitude and $\sigma^2$ is the variance
of the series $X_t$. This series is in this case supposed to be a pure
random (non-autocorrelated) series of N terms. The items follow also a
normal distribution.


1 H. T. Davis: The Analysis of Economic Time Series (Bloomington, Ind.,
1941), pp. 183 ff. M. G. Kendall: The Advanced Theory of Statistics, vol. 2
(London, 1946), pp. 423 ff.

2 H. T. Davis: op. cit., p. 186.




There are three tests available:

(a) Schuster Test. According to Schuster, 3 the probability that an
empirical squared amplitude $R^2$ is at least k times the mean squared
amplitude $R_M^2$ is:

(5)   $P_S = e^{-k}$

Hence we have:

(6)   $k = -\log_e P_S$

Table 1 gives the value of k for selected probabilities $P_S$.

TABLE 1

  P_S        k
 0.001      6.9
 0.005      5.3
 0.010      4.6
 0.050      3.0


(b) Walker Test. 4 We require sometimes the probability that at least
one squared amplitude $R_n^2$ will be k times the mean squared amplitude
$R_M^2$. Let this probability be $P_W$. Then we have:

(7)   $P_W = 1 - (1 - e^{-k})^{N/2}$

Tables for this function have been provided by H. T. Davis. 5

(c) Fisher Test. 6 The Schuster test and the Walker test require a
knowledge of the population variance $\sigma^2$. This quantity enters into
the computation of the mean squared amplitude $R_M^2$. But this knowledge is
seldom available with economic time series.

Fisher provided a test for the following proposition: Let $R_n^2$ be the
largest of the squared amplitudes $R_P^2$. The probability that

(8)   $G = \dfrac{R_n^2}{2V}$


3 A. Schuster: “On hidden periodicities,” Terrestrial Magnetism, vol. 3
(1898), pp. 13 ff.; “The periodogram and its optical analogy,” Proceedings of
the Royal Society of London, series A, vol. 77 (1906), pp. 136 ff.; “On the
periodicities of sunspots,” Philosophical Transactions of the Royal Society of
London, series A, vol. 206 (1906), pp. 69 ff.

4 Sir Gilbert Walker: “On the criterion for the reality of relationships or
periodicities,” Calcutta Indian Meteorological Memoirs, vol. 21 (1914), part 9;
“On periodicity,” Quarterly Journal of the Meteorological Society, vol. 51
(1925), pp. 387 ff.

5 H. T. Davis: op. cit., pp. 583 ff.

6 R. A. Fisher: “Tests of significance in harmonic analysis,” Proceedings of
the Royal Society of London, series A, vol. 125 (1929), pp. 54 ff.






(where V is the sample variance of $X_t$) is greater than a given value g is

(9)   $P = \sum_{i=1}^{m} (-1)^{i-1} \binom{n}{i} (1 - ig)^{n-1}$

Here m is the greatest integer less than 1/g.
H. T. Davis has tabulated function (9). 7

We have evidently

(10)   $k = ng$
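
Function (9) is easy to evaluate directly instead of being read from tables. The following Python sketch is an illustration only: the function name is hypothetical, the sum is taken to start at i = 1 as in Fisher's published form of the series, and n is assumed to denote the number of squared amplitudes among which the largest is chosen.

```python
import math

def fisher_probability(g, n):
    """Probability that the largest ratio G exceeds g, following formula (9)."""
    # m is the greatest integer strictly less than 1/g (and cannot exceed n).
    m = min(math.ceil(1.0 / g) - 1, n)
    return sum(
        (-1) ** (i - 1) * math.comb(n, i) * (1.0 - i * g) ** (n - 1)
        for i in range(1, m + 1)
    )

# With n = 5 and 1/2 < g < 1 only the first term survives:
# P = 5 (1 - g)^4, which equals 0.05 when (1 - g)^4 = 0.01.
g_check = 1.0 - 0.01 ** 0.25
p_check = fisher_probability(g_check, 5)
print(round(g_check, 5), round(p_check, 5))
```

A critical value of g for a given level of significance can be found by solving this function numerically; relation (10) then converts it into the corresponding ratio k.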


It is, however, very dubious whether these methods can give us valid 
results with economic data. E. B. Wilson 8 has carried out a very thorough
periodogram analysis of American business activity, 1790-1929. He 
comes to essentially negative conclusions about the applicability of the 
method of periodogram analysis. 

This is not surprising because of the fact that the business cycle is 
probably not strictly periodic. The assumption of the normal distribution 
of the errors or deviations, and especially the hypothesis that they are 
not autocorrelated, are in all likelihood not fulfilled in economic data. 
Hence the tests indicated above may have very little applicability as far 
as economic time series are concerned. 

H. Wold 9 has shown that the method should be used with great caution.
The correlogram (section 10.6) of the scheme of hidden periodicities is 
a harmonic with constant amplitude. It is especially not applicable if 
the errors or disturbances are autocorrelated. The relationship between 
the method of periodogram analysis and correlogram analysis will be 
discussed in section 10.6. 

Example 1. We investigate further the monthly series of freight car
loadings given by H. T. Davis 10 described in section 9.1 on Fourier analysis
(Example 1). We have, for instance, for a period of 6 months the squared
amplitude $R_6^2$ = 2177.6869, and for a period of 12 months the squared
amplitude $R_{12}^2$ = 5666.9430.

The length of the series is N = 168. The empirical variance is $\sigma^2$ =
23,870.2441. This may be a pretty good approximation to the population
variance because of the considerable length of our series. From this
quantity we can compute the estimated mean amplitude for a random

7 H. T. Davis: op. cit., pp. 601 ff.

8 E. B. Wilson: “The periodogram of American business activity,” Quarterly
Journal of Economics, vol. 48 (1934), pp. 375 ff. See also B. Greenstein:
“Periodogram analysis with special application to business failures,”
Econometrica, vol. 3 (1935), pp. 170 ff.

9 H. Wold: A Study in the Analysis of Stationary Time Series (Uppsala, 1938),
pp. 25 ff.

10 H. T. Davis: op. cit., pp. 78 ff.




series without periodic fluctuations from formula (4): $R_M^2$ = (4)
(23,870.2441)/168 = 568.339.

The two ratios $k_6$ and $k_{12}$ are $k_6 = R_6^2/R_M^2 = 3.83$ and
$k_{12} = R_{12}^2/R_M^2 = 9.97$.

If we apply the Schuster test we see immediately that both values are
significant on the 5 per cent level. Here only k = 3.0 is required. The
quantity $k_{12}$ is also significant at the 0.1 per cent level. At the 0.1
per cent level of significance only a k of 6.9 is required. Hence it is
extremely unlikely that such large ratios as the ones observed for periods
of 6 and 12 months should have arisen by chance.

Using the Walker test of significance, we obtain for the level of
significance of 5 per cent a k of about 7.3. The empirical value $k_6$ is
smaller than this value. Hence it is not significant from the point of view
of the Walker test.

Finally, the Fisher test gives for a level of significance of 5 per cent
a k of about 7.4. Again, our empirical value $k_6$ is not significant.
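
The ratios and the Schuster and Walker probabilities of this example can be recomputed with the following Python sketch. The function names are hypothetical; the sketch evaluates formulas (4), (5), and (7) directly and uses the sample variance in place of the population variance, as the text does.

```python
import math

def mean_squared_amplitude(variance, n):
    """Formula (4): expected squared amplitude of a purely random series."""
    return 4.0 * variance / n

def schuster_probability(k):
    """Formula (5)."""
    return math.exp(-k)

def walker_probability(k, n):
    """Formula (7): at least one of N/2 squared amplitudes reaches k times the mean."""
    return 1.0 - (1.0 - math.exp(-k)) ** (n / 2.0)

# Freight car loadings: N = 168 monthly items, variance 23,870.2441.
N = 168
rm2 = mean_squared_amplitude(23870.2441, N)
k6 = 2177.6869 / rm2
k12 = 5666.9430 / rm2
print(round(rm2, 3), round(k6, 2), round(k12, 2))
print(schuster_probability(k6), walker_probability(k6, N))
print(schuster_probability(k12), walker_probability(k12, N))
```

The 6-month ratio passes the Schuster test at the 5 per cent level but not the Walker test, which allows for the fact that the largest of many amplitudes has been picked out; the 12-month ratio passes both.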

Needless to say, the two periods of 6 months and 12 months represent
nothing else but the seasonal movement of freight car loadings, which is
very pronounced. Hence the seasonal movement can be approximated
by the formula:

(11)   $Y_t = -40.1310 \cos 60t - 23.8158 \sin 60t - 25.0916 \cos 30t - 70.9743 \sin 30t$

This can also be written:

(12)   $Y_t = 47.524 \sin (60t - 30^\circ 40') + 32.036 \sin (30t - 70^\circ 35')$


This representation of the seasonal movement is justified because we 
know that the season is a periodic movement with a period of 12 months. 

Example 2. We analyze now the series of the quantity of meat consumed,
1919-41, for hidden periodicities. The series consists of N = 23 terms.
(See section 9.1, Example 2.) We give in Table 2 the squared amplitudes
$R_n^2$ and the ratio k of the squared amplitudes to the mean squared
amplitude. The variance of the series is $\sigma^2$ = 23.833. Hence the mean
squared amplitude is $R_M^2$ = (4)(23.833)/23 = 4.144 [formula (4)].

Let us first test our results with the Schuster test. We see from Table 1
that we want at the 5 per cent level of significance a k = 3.00. Actually,
for the periods of 8 and 9 years we have results significant at this level
of significance; $k_8$ is not significant at the 1 per cent level, where a k
of 4.60 is required.

For the Walker test we want for N = 20 at the 5 per cent level of
significance a k at least as great as 5.2. Hence none of our empirical
values is significant.




TABLE 2

                                  Ratio of Squared Amplitude
Period    Squared Amplitude       to Mean Squared Amplitude
  n            R_n^2                          k
  2            0.530                        0.128
  3            3.084                        0.744
  4            3.610                        0.871
  5            7.854                        1.893
  6            2.186                        0.527
  7            8.563                        2.066
  8           18.103                        4.367
  9           19.678                        4.747
 10            6.288                        1.517
 11            1.480                        0.357


For the Fisher test we require for N = 20 at the 5 per cent level of 
significance a k at least as large as 5.4. Again, none of our empirical 
ratios is significant. 


TABLE 3

All Commodity Wholesale Price Index

                                  Ratio of Squared Amplitude
Period    Squared Amplitude       to Mean Squared Amplitude
  n            R_n^2                          k
  2            2.097                        0.722
  3            4.690                        1.616
  4            0.353                        0.121
  5            0.988                        0.340
  6            7.136                        2.450
  7            1.737                        0.599
  8            6.968                        2.903
  9            7.424                        2.557
 10            4.053                        1.396
 11            1.803                        0.621
 12            0.117                        0.040
 13            0.011                        0.004
 14            0.476                        0.164
 15            1.582                        0.545
 16            0.641                        0.221
 17            1.094                        0.377
 18            0.737                        0.254
 19            0.491                        0.169
 20            0.998                        0.344




Hence we may conclude tentatively that there is possibly a period of
about 8 or 9 years in the data. This period is, however, somewhat
doubtful since only the Schuster test yields significant results. It is of
course possible and even likely that there are no hidden periodicities and
that there is instead a linear scheme of autoregression (section 10.6.2).

Example 3. We analyze now the N = 52 items of the all commodity
wholesale price index, 1891-1942. The data are yearly. They are
deviations from a trend which is itself a moving average of variable length.
(See section 9.1, Example 3.)

The variance of the series is $\sigma^2 = 37.733$. The mean squared amplitude
is hence 2.903 [formula (4)].

We give in Table 3 the periods, squared amplitudes, and ratios.
If we want to use the Schuster test we require at the 5 per cent level of
significance a k of at least 3.00. None of our empirical ratios is as large
as this. The k for a period of 8 (2.903) is almost significant.

For the Walker test we want for N = 50 at the 5 per cent level a k of
at least 6.1. None of our empirical ratios is nearly as great.

For the Fisher test we require for N = 50 at the 5 per cent level of
significance a k of at least 6.5. This is also larger than our empirical values.

Hence it appears that there are no hidden periodicities
in the data. But this result does not preclude the existence of a linear
scheme of autoregression (stochastic difference equation) (sections 10.3, ...).
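The three comparisons of Example 3 can be retraced with a short sketch. It is only an illustration: the ratios are the k column of Table 3 as printed, the critical values are those quoted in the text, and the expression `4 * variance / N` for the mean squared amplitude is an assumption about formula (4) of section 9.2, chosen because it reproduces the printed value 2.903.

```python
# Ratios k from Table 3 (periods n = 2, ..., 20), copied as printed.
ratios = {
    2: 0.722, 3: 1.616, 4: 0.121, 5: 0.340, 6: 2.450, 7: 0.599,
    8: 2.903, 9: 2.557, 10: 1.396, 11: 0.621, 12: 0.040, 13: 0.004,
    14: 0.164, 15: 0.545, 16: 0.221, 17: 0.377, 18: 0.254, 19: 0.169,
    20: 0.344,
}

N = 52             # number of yearly items in the series
variance = 37.733  # variance of the series, from the text

# Assumption: formula (4) of section 9.2 is taken to be 4 * variance / N.
mean_squared_amplitude = 4.0 * variance / N

# 5 per cent critical values of k quoted in the text for each test.
critical_values = {"Schuster": 3.00, "Walker": 6.1, "Fisher": 6.5}

# Period with the largest ratio, and whether it passes each test.
largest_period = max(ratios, key=ratios.get)
significant = {
    test: ratios[largest_period] >= limit
    for test, limit in critical_values.items()
}
```

Under these inputs the largest ratio belongs to the period of 8 years, and it falls short of all three critical values, as stated above.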

9.3 Wald’s Method of Eliminating Seasonal Fluctuations 

A method of eliminating seasonal fluctuations 1 is due
to A. Wald. 2 It is based upon the following assumptions:

1. The difference between the sum of the trend and the cyclical
component and its centered 12-month moving average is negligible.

2. The 12-month means of the random component are very near to zero.

3. The seasonal component is not strictly periodic with a period of
12 months, but it changes its value only slowly with time.

1 H. Mendershausen: "Annual survey of statistical technique: Methods of
computing and eliminating changing seasonal fluctuations," Econometrica,
vol. 5 (1937), pp. 234 ff. S. Kuznets: Seasonal Variations in Industry and
Trade (New York, 1933). O. Donner: "Saisonschwankungen als Problem der
Konjunkturforschung," Vierteljahrshefte zur Konjunkturforschung
(Sonderheft 6, Berlin, 1928).

2 . . .





These assumptions seem to be very reasonable from an economic point 
of view. 

On the basis of these assumptions Wald computes the seasonal variation 
as follows: 


Let $f_{ij}$ (i = 1, 2, ..., N; j = 1, 2, ..., 12) be the value of the original
series in year i and month j. We form the centered 12-month
moving averages (8.2):

(1)  $f^{*}_{ij} = \dfrac{f_{i,j-6} + 2 \sum_{k=-5}^{5} f_{i,j+k} + f_{i,j+6}}{24}$   (i = 1, 2, ..., N; j = 1, 2, ..., 12)

It should be noted that these averages are correctly centered, and hence
we take 1/24 of the values 6 months distant and 1/12 of the values 5, 4, ..., 1,
0 months distant from the center of the average. Formula (1) gives
the centered moving 12-month average for the jth month of the ith year.
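Formula (1) can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `centered_ma12` and the 0-based position `t` in one long monthly list are assumptions made for convenience, and the data are the first fourteen index values of Table 5 in Example 2 below.

```python
def centered_ma12(series, t):
    """Centered 12-month moving average at 0-based position t, formula (1).

    Returns None where the average is not defined, i.e. within
    6 months of either end of the series.
    """
    if t - 6 < 0 or t + 6 >= len(series):
        return None
    inner = sum(series[t - 5 : t + 6])  # the 11 values at distance 0, ..., 5
    # Weight 1/24 on the two values 6 months distant, 2/24 = 1/12 on the rest.
    return (series[t - 6] + 2.0 * inner + series[t + 6]) / 24.0


# Wholesale farm price index, January 1941 to February 1942 (Table 5).
index = [80.8, 80.6, 81.5, 83.2, 84.9, 87.1, 88.8,
         90.3, 91.8, 92.4, 92.5, 93.6, 96.0, 96.7]

july_1941 = centered_ma12(index, 6)  # position 6 is July 1941
```

With these data the value for July 1941 agrees with the 87.925 shown in Table 5.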

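Then we form for each month the arithmetic mean of the differences
$f_{ij} - f^{*}_{ij}$ (this is the average for each month):

(2)  $a_j = \dfrac{1}{N} \sum_{i=1}^{N} \left( f_{ij} - f^{*}_{ij} \right)$   (j = 1, 2, ..., 12)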


If extreme values appear, they ought to be neglected in the formation 
of this average. 

The values are corrected in order to make their sum zero:

(3)  $a'_j = a_j - |a_j| \, \dfrac{\sum_{k=1}^{12} a_k}{\sum_{k=1}^{12} |a_k|}$

Here we denote by $|a_j|$ the absolute value of $a_j$, i.e., its value without
regard to the sign.
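The correction (3) is easy to check numerically. The sketch below is illustrative only; the function name `correct_to_zero_sum` is hypothetical, and the input values are the monthly means $a_j$ of Table 6 in Example 2 below.

```python
def correct_to_zero_sum(a):
    """Apply formula (3): subtract from each a_j the share |a_j| / sum|a_k|
    of the total sum(a_k), so that the corrected values sum to zero."""
    total = sum(a)
    abs_total = sum(abs(v) for v in a)
    return [v - abs(v) * total / abs_total for v in a]


# Monthly means a_j for months 1, ..., 12 (Table 6, column 2).
a = [0.096, 0.144, 0.632, -0.050, -0.790, -1.355,
     0.495, 0.613, -0.903, 0.144, 0.447, 0.245]

a_corrected = correct_to_zero_sum(a)
```

Each value is shifted in proportion to its absolute size, so large seasonal means absorb most of the correction and the signs are preserved.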

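The seasonal fluctuation is computed by the formula:

(4)  $s_{ij} = \dfrac{a'_j \left[ \sum_{k=j-5}^{j+5} a'_k \left( f_{ik} - f^{*}_{ik} \right) + \tfrac{1}{2} a'_{j+6} \left( f_{i,j+6} - f^{*}_{i,j+6} \right) + \tfrac{1}{2} a'_{j-6} \left( f_{i,j-6} - f^{*}_{i,j-6} \right) \right]}{\sum_{k=1}^{12} (a'_k)^2}$

(i = 1, 2, ..., N; j = 1, 2, ..., 12)

Here a month index outside the range 1, ..., 12 refers to the corresponding
month of the adjacent year, and the $a'_k$ are repeated with period 12.

In order to eliminate the season we form:

(5)  $f_{ij} - s_{ij}$   (i = 1, 2, ..., N; j = 1, 2, ..., 12)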





This series should be reasonably free from the seasonal movement, 
if the fundamental assumptions underlying the procedure are satisfied. 

Wald also gives further methods of closer approximation to the changing
seasonal, which we will not present here.

More complicated methods have been proposed by other authors. 3 
They involve the use of correlation procedures. 

It should again be emphasized that Wald’s method is based upon the 

assumption that the seasonal is a component added to the other parts of 

the empirical series. This assumption is not always strictly justified with 
economic time series. 

Example 1. Wald illustrated his procedure by the elimination of the
seasonal from the monthly series of the number of persons utilizing the
municipal streetcars in Vienna. 4 We give in Table 1 the values of
$f_{ij} - f^{*}_{ij}$, i.e., the difference between the empirical numbers and
the corresponding moving 12-month averages (1).


TABLE 1

Difference between the Series and Moving 12-Month Averages

                                   Months
Year     1     2     3     4     5     6     7     8     9    10    11    12

1925                                       -160  -381    80   520   -20    50
1926  -172  -524    30   160   350    60    -60  -547    10   420    70   210
1927  -160  -672   100   180   310   170   -190  -585   650   110   -40   130
1928  -200  -456    20    30   250   230     40  -505   170   400    90   320
1929  -260  -827   200    40   430   200    -90  -530    30   270   110   200
1930  -242  -535   210   110   450   310   -406  -510   -85   324    93   295
1931  -100  -578   -11    55   524   215    -71  -493   145   437    43    99
1932  -106  -362   -31   173   393   107   -220  -450    -3   131     1   486
1933  -104  -384    72    37   204    85   -165  -451   155   306    79    81
1934   -82  -658   157   173   330   118   -232  -568    99   270    51   184
1935  -144  -481     2   134   303   235


The column means $a_j$ are computed from formula (2):

TABLE 2

Month     1     2    3    4    5    6     7     8    9   10   11   12

a_j    -157  -548   75  109  354  173  -177  -502   35  319   32  205


F. I. Zrza 


cy " ica ' - m 

4 A. Wald: op. cit., pp. 117 ff.





The corrected values $a'_j$ are computed according to formula (3):

TABLE 3

Month        a'_j

  1       -152.22
  2       -531.29
  3         77.29
  4        112.32
  5        364.79
  6        178.27
  7       -171.60
  8       -486.70
  9         36.07
 10        328.73
 11         32.98
 12        211.25




Finally, we give in Table 4 the seasonal movement, which is computed 
according to formula (4). 


TABLE 4

Seasonal Variation s_ij

                                   Months
Year     1     2     3     4     5     6     7     8     9    10    11    12

1926  -144  -491    79   113   356   175  -175  -494    41   364    36   231
1927  -169  -605    90   133   392   191  -181  -515    35   304    30   186
1928  -136  -451    62    88   326   160  -162  -463    44   389    39   265
1929  -190  -677   100   144   451   221  -208  -587    39   340    34   222
1930  -163  -602    87   125   415   203  -200  -553    43   375    37   246
1931  -174  -575    83   120   406   198  -183  -519    35   309    31   190
1932  -134  -482    69   100   285   139  -149  -423    33   292    29   168
1933  -121  -415    61    88   311   152  -130  -367    34   302    31   208
1934  -151  -533    83   119   384   189  -186  -532    37   321    32   203

Example 2. In the following we apply the Wald method of eliminating
seasonal variations to the monthly wholesale farm price index, 1941-47.
We present the data in Table 5.


TABLE 5

                           12-Month    Deviation from
Year   Month    Index      Average     12-Month Average    Season
 i       j      f_ij       f*_ij       f_ij - f*_ij        s_ij

1941     1       80.8
         2       80.6
         3       81.5
         4       83.2
         5       84.9
         6       87.1
         7       88.8       87.925         0.875
         8       90.3       89.228         1.072
         9       91.8       90.570         1.230
        10       92.4       91.886         0.514
        11       92.5       93.112        -0.612
        12       93.6       94.169        -0.569
1942     1       96.0       95.061         0.939          -0.0147
         2       96.7       95.844         0.856          -0.0348
         3       97.6       96.541         1.059          -0.1149
         4       98.7       97.183         1.517           0.0020
         5       98.8       97.824         0.976           0.0357
         6       98.6       98.457         0.143           0.0479
         7       98.7       99.012        -0.312          -0.0196
         8       99.2       99.500        -0.300          -0.0342
         9       99.6       99.984        -0.384           0.0533
        10      100.0      100.434        -0.434          -0.0092
        11      100.3      100.865        -0.565          -0.0377
        12      101.0      101.301        -0.301          -0.0042
1943     1      101.9      101.705         0.195           0.0088
         2      102.5      102.055         0.445           0.0165
         3      103.4      102.364         1.036           0.0697
         4      103.7      102.636         1.064           0.0469
         5      104.1      102.869         1.231          -0.0748
         6      103.8      103.068         0.732          -0.1341
         7      103.2      103.218        -0.018           0.0527
         8      103.1      103.322        -0.222           0.0581
         9      103.1      103.383        -0.283          -0.0163
        10      103.0      103.408        -0.408          -0.0062
        11      102.9      103.412        -0.512           0.0250
        12      103.2      103.429        -0.229           0.0225
1944     1      103.3      103.488        -0.188           0.0036
         2      103.6      103.558         0.042           0.0047
         3      103.8      103.628         0.172           0.0203
         4      103.9      103.713         0.187          -0.0018
         5      104.0      103.828         0.179          -0.0391
         6      104.3      103.945         0.355          -0.0911
         7      104.1      104.075         0.025           0.0401
         8      103.9      104.209        -0.309           0.0509
         9      104.0      104.339        -0.339          -0.0634
        10      104.1      104.476        -0.376           0.0100
        11      104.4      104.634        -0.234           0.0186
        12      104.7      104.729        -0.092           0.0085
1945     1      104.9      104.942        -0.042           0.0021
         2      105.2      105.092         0.108           0.0533
         3      105.3      105.216         0.084           0.0077
         4      105.7      105.342         0.358          -0.0041
         5      106.0      105.517         0.483          -0.2024
         6      106.1      105.716         0.384          -0.1892
         7      105.9      105.908        -0.008           0.0831
         8      105.7      106.104        -0.404           0.1027
         9      105.2      106.359        -1.159          -0.0814
        10      105.9      106.696        -0.796          -0.0306
        11      106.8      107.092        -0.292          -0.2023
        12      107.1      107.583        -0.483           0.5369
1946     1      107.1      108.649        -1.549           0.1563
         2      107.7      110.408        -2.708           0.4553
         3      108.9      112.166        -3.266           2.4112
         4      110.2      114.124        -3.924          -0.1964
         5      110.0      116.670        -5.670          -3.2815
         6      112.9      119.449        -6.549          -6.0700
         7      124.7      122.291         2.409           2.5151
         8      129.1      125.259         3.841           3.1680
         9      124.0      128.484        -4.484          -4.8210
        10      134.1      131.738         2.362           0.9310
        11      139.7      134.804         4.896           2.7426
        12      140.9      137.754         3.146           1.2940
1947     1      141.5      140.279         1.221
         2      144.5      142.379         2.121
         3      149.5      144.792         4.708
         4      147.7      147.201         0.499
         5      147.1      149.042        -1.942
         6      147.6      150.796        -3.196
         7      150.6
         8      153.6
         9      157.4
        10      158.5
        11      159.5
        12      163.2


Column 4 of Table 5 is the centered 12-month moving average, computed
from formula (1). We have, for instance, for the value of the
average for July 1941: (1/24)(80.8) + (1/12)(80.6 + 81.5 + 83.2 + 84.9 +
87.1 + 88.8 + 90.3 + 91.8 + 92.4 + 92.5 + 93.6) + (1/24)(96.0) = 87.925.

The next column is the difference between the index and the moving
average. We have, for instance, again for July 1941: 88.8 - 87.925 =
0.875.

By averaging the values of the data in column 5 for each month we get
the values of $a_j$ [formula (2)]. These are presented in Table 6.


TABLE 6

Season

Month       a_j       a'_j

  1        0.096      0.101
  2        0.144      0.151
  3        0.632      0.662
  4       -0.050     -0.048
  5       -0.790     -0.752
  6       -1.355     -1.290
  7        0.495      0.519
  8        0.613      0.642
  9       -0.903     -0.860
 10        0.144      0.151
 11        0.447      0.468
 12        0.245      0.256


We have, for instance, for the average of the values in column 5 of
Table 5 for all Januaries: (0.939 + 0.195 - 0.188 - 0.042 - 1.549 +
1.221)/6 = 0.096. These values are given in column 2 of Table 6. The
$a_j$ have been adjusted with the help of formula (3) so that their sum is zero.

We have in the last column of Table 5 the estimated season, calculated
from formula (4). We have, for instance, for the estimated value of the
seasonal movement in January 1942:

0.101 [(1/2)(0.519)(0.875) + (0.642)(1.072) + (-0.860)(1.230)
+ (0.151)(0.514) + (0.468)(-0.612) + (0.256)(-0.569)
+ (0.101)(0.939) + (0.151)(0.856) + (0.662)(1.059)
+ (-0.048)(1.517) + (-0.752)(0.976) + (-1.290)(0.143)
+ (1/2)(0.519)(-0.312)]
/ [(0.101)^2 + (0.151)^2 + (0.662)^2 + (-0.048)^2 + (-0.752)^2 + (-1.290)^2
+ (0.519)^2 + (0.642)^2 + (-0.860)^2 + (0.151)^2 + (0.468)^2 + (0.256)^2]
= -0.0147.
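The arithmetic for January 1942 can be retraced with a short sketch of formula (4). It is only an illustration under stated assumptions: the function name `seasonal_value` and the convention of passing a 13-month window of deviations centered on the target month are hypothetical, while the numbers are the $a'_j$ of Table 6 and the deviations of Table 5 from July 1941 to July 1942.

```python
def seasonal_value(a_corr, window, month):
    """Seasonal value from formula (4) for one month.

    a_corr : the 12 corrected monthly means a'_1, ..., a'_12
    window : 13 deviations f - f*, centered on the target month
    month  : calendar month of the target, 1, ..., 12
    """
    if len(window) != 13:
        raise ValueError("window must hold 13 consecutive deviations")
    weighted = 0.0
    for offset, deviation in zip(range(-6, 7), window):
        k = (month - 1 + offset) % 12               # calendar month of this term
        weight = 0.5 if abs(offset) == 6 else 1.0   # half weight at the ends
        weighted += weight * a_corr[k] * deviation
    return a_corr[month - 1] * weighted / sum(v * v for v in a_corr)


# Corrected monthly means a'_j (Table 6, column 3).
a_corr = [0.101, 0.151, 0.662, -0.048, -0.752, -1.290,
          0.519, 0.642, -0.860, 0.151, 0.468, 0.256]

# Deviations from the moving average, July 1941 to July 1942 (Table 5).
deviations = [0.875, 1.072, 1.230, 0.514, -0.612, -0.569, 0.939,
              0.856, 1.059, 1.517, 0.976, 0.143, -0.312]

s_jan_1942 = seasonal_value(a_corr, deviations, 1)
```

The result agrees with the value -0.0147 to four decimals.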


The computed seasonal movement is given in the last column of Table 5. 

It is evident that there is a stronger seasonal movement in later years 

than in the earlier period. Our computed seasonal is valid only if the 

seasonal movement is actually an additive component in the empirical 
time series. 




9.4 A Non-parametric Test for Cyclical Fluctuations 

Most tests for cyclical fluctuations are parametric; i.e., they are based 
upon the assumption of a specific kind of distribution of the errors or 
deviations, the normal distribution. They also assume more or less 
regularity concerning the amplitudes. Such tests have been presented 
in the section on periodogram analysis, section 9.2. But W. A. Wallis 
and G. H. Moore 1 have devised a non-parametric 2 test which has the 
advantage that it is valid for any kind of distribution. It bases itself 
simply upon combinatorial considerations of the observed number of 
phases and turning points. It is a test for randomness. Naturally it is 
also less efficient than other tests, but it is computationally much simpler. 

A turning point is a peak if it is a relative maximum and a trough if 
it is a relative minimum. The length or duration of a phase is the number 
of time units between turning points. 

Alternatively, the length of a phase is the length of the sequence of like
signs (+ or -) in the first differences.
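The counting of phases can be sketched as follows. This is a minimal illustration with two assumptions that the text does not spell out: a zero first difference (a tie) is simply skipped, and the first and last runs are dropped as incomplete because they are not bounded by a turning point on both sides. The function names and the short example series are hypothetical.

```python
def phase_lengths(series):
    """Lengths of the completed runs of like signs in the first differences."""
    signs = []
    for previous, current in zip(series, series[1:]):
        if current != previous:  # assumption: ties are skipped
            signs.append(1 if current > previous else -1)
    runs = []
    for sign in signs:
        if runs and runs[-1][0] == sign:
            runs[-1][1] += 1         # extend the current run
        else:
            runs.append([sign, 1])   # a turning point starts a new run
    # Assumption: the first and last runs are incomplete and are not counted.
    return [length for _, length in runs[1:-1]]


def tabulate(lengths):
    """Numbers of phases of length 1, of length 2, and of greater length."""
    u1 = sum(1 for d in lengths if d == 1)
    u2 = sum(1 for d in lengths if d == 2)
    u3 = sum(1 for d in lengths if d > 2)
    return u1, u2, u3


example = [1, 3, 2, 4, 5, 6, 3, 2, 4, 1]
counts = tabulate(phase_lengths(example))
```

In the example the signs of the first differences are +, -, +, +, +, -, -, +, -, so the completed phases have lengths 1, 3, 2, and 1.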

In a series of N independent random observations the expected number 
of completed runs of length d of the signs of first differences (d + signs 
or d — signs) is: 

(1)  $\dfrac{2 (d^2 + 3d + 1)(N - d - 2)}{(d + 3)!}$

Specifically, the expected number of runs of 1 is:

(2)  $\dfrac{5 (N - 3)}{12}$

The expected number of runs of 2 is:

(3)  $\dfrac{11 (N - 4)}{60}$

The expected number of runs longer than 2 is:

(4)  $\dfrac{4N - 21}{60}$
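The expected frequencies (1) to (4) can be tabulated with a few lines. The sketch is illustrative; the function names are hypothetical, and N = 23 is chosen because it is the length of the meat series used in Example 3 below.

```python
from math import factorial


def expected_runs_of_length(d, n):
    """Expected number of completed runs of length d, formula (1)."""
    return 2.0 * (d * d + 3 * d + 1) * (n - d - 2) / factorial(d + 3)


def expected_phase_table(n):
    """Expected numbers of runs of length 1, 2, and more than 2,
    formulae (2), (3), and (4)."""
    u1 = 5.0 * (n - 3) / 12.0
    u2 = 11.0 * (n - 4) / 60.0
    u3 = (4.0 * n - 21.0) / 60.0
    return u1, u2, u3


expected_23 = expected_phase_table(23)
```

Formulae (2) and (3) are the special cases d = 1 and d = 2 of formula (1), which the first function reproduces directly.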


1 W. A. Wallis and G. H. Moore: "A significance test for time series
analysis," Journal of the American Statistical Association, vol. 36 (1941),
pp. 401 ff.; "A significance test for time series," Technical Paper 1,
National Bureau of Economic Research (New York, 1941).

2 H. Scheffé: "Statistical inference in the non-parametric case," Annals of
Mathematical Statistics, vol. 14 (1943), pp. 305 ff.



9.4] NON-PARAMETRIC TEST FOR CYCLICAL FLUCTUATIONS 235 


We may compare these expected frequencies (2), (3), and (4) with the
empirical frequencies by a criterion, called $\chi_p^2$. This is constructed in
a fashion analogous to $\chi^2$. Tables are provided which permit the test
of significance.

Let the number of runs of 1 observed in an empirical series be $u_1$.
There are $u_2$ runs of length 2 and $u_3$ runs of greater length. Let $U_1$,
$U_2$, and $U_3$ be the corresponding expected frequencies (2), (3), and (4).
Then we compute the criterion $\chi_p^2$ by the formula:

(5)  $\chi_p^2 = \dfrac{(u_1 - U_1)^2}{U_1} + \dfrac{(u_2 - U_2)^2}{U_2} + \dfrac{(u_3 - U_3)^2}{U_3}$

Six-sevenths of this quantity is approximately distributed like $\chi^2$ for
2 degrees of freedom if $\chi_p^2$ is smaller than 6.3. If $\chi_p^2$ is larger than
this value it follows the $\chi^2$-distribution with 2.5 degrees of freedom.
Tables for the test are provided in the paper of Wallis and Moore. 3

An auxiliary test is based upon the total number of phases 4 or the total
number of completed runs in the first differences $u_1 + u_2 + u_3$. This
quantity is for large samples normally distributed with mean:

(6)  $\dfrac{2N - 7}{3}$

and variance:

(7)  $\dfrac{16N - 29}{90}$
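Both tests can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, the two-sided probability is obtained from the normal distribution through `math.erf`, and the inputs are the figures quoted in the examples of this section.

```python
from math import erf, sqrt


def chi_p_squared(observed, expected):
    """Criterion (5) from observed counts (u1, u2, u3) and expected (U1, U2, U3)."""
    return sum((u - big_u) ** 2 / big_u for u, big_u in zip(observed, expected))


def total_phase_test(total, n):
    """Large-sample test on the total number of completed phases.

    Returns the mean (6), the variance (7), the normal deviate, and the
    probability of a deviation at least this large in either direction.
    """
    mean = (2.0 * n - 7.0) / 3.0
    variance = (16.0 * n - 29.0) / 90.0
    deviate = (total - mean) / sqrt(variance)
    probability = 1.0 - erf(abs(deviate) / sqrt(2.0))
    return mean, variance, deviate, probability


# Example 3 below: observed and expected phases for N = 23.
chi_meat = chi_p_squared((8, 2, 2), (8.333, 3.483, 1.183))

# Example 1 below: 32 completed phases in a series of N = 48 items.
mean_48, variance_48, deviate_48, probability_48 = total_phase_test(32, 48)
```

For N = 48 the mean and variance come out as 29.67 and 8.21, and the two-sided probability is close to the value of about 0.4 quoted in Example 1.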


Example 1. M. G. Kendall 5 gives as an example the deviation of
potato yields in England and Wales from their moving 9-year averages.
We present the data in Table 1 together with the signs of the first
differences.

TABLE 1

Deviations of the Potato Yields from 9-Year Moving Averages

Year    Deviation    Sign of First Difference

1888       -6
1889        2                  +
1890       -4                  -
1891       -3                  +
1892       -1                  +
1893        6                  +
1894       -2                  -
1895        7                  +
1896        3                  -
1897       -6                  -
1898        2                  +
1899        0                  -
1900       -7                  -
1901        6                  +
1902       -3                  -
1903       -7                  -
1904        2                  +
1905        0                  -
1906        1                  +
1907       -7                  -
1908        8                  +
1909        4                  -
1910        3                  -
1911        4                  +
1912      -15                  -
1913        3                  +
1914        2                  -
1915        1                  -
1916       -2                  -
1917      ...                  +
1918        4                  -
1919       -4                  -
1920       -3                  +
1921       -9                  -
1922       11                  +
1923       -1                  -
1924       -1                  0
1925        2                  +
1926       -9                  -
1927       -3                  +
1928        9                  +
1929        5                  -
1930        1                  -
1931      -10                  -
1932        1                  +
1933        2                  +
1934        5                  +
1935       -4                  -

3 W. A. Wallis and G. H. Moore: "A significance test for time series,"
Technical Paper 1, National Bureau of Economic Research (New York, 1941).

4 Ibid.

5 M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946),
pp. ...




The number of observed phases of a given length and the number 
expected according to formulae (2), (3), and (4) are given in Table 2. 


TABLE 2

Length of Phase          Number
                   Observed   Expected

1                     22        18.75
2                      6         8.07
3 and over             4         3.18

Total                 32



We have $\chi_p^2 = 1.305$ for a series of length N = 48. This is clearly
not significant. According to the table, at the 5 per cent level a $\chi_p^2$ as
large as 6.898 is permissible.

Hence we conclude that there is no evidence of cyclical fluctuations in 
the distribution of phase lengths of yields, if the yields are deviations from 
9-year moving averages. 

The total number of completed runs in the signs of the first differences
(sum of column 2 in Table 2) is 32. This quantity is, according to
formulae (6) and (7), normally distributed with mean [(2)(48) - 7]/3 =
29.67 and variance [(16)(48) - 29]/90 = 8.21. The deviation of our
empirical total number of completed runs from the mean is so small that
the probability of this or a larger positive or negative deviation is about
0.4. Hence the empirical number of runs is not significant, and we have
no reason to consider the series periodic or oscillatory.

Example 2. We give next an analysis of the phases of the all commodity
wholesale price index, 1890-1947, presented in section 8.2.1, Example 1.
We have N = 58. Table 3 gives the observed and expected phases of
various lengths.

TABLE 3

Length of Phase          Number
                   Observed   Expected

1                     12       22.916
2                      6        9.000
3 or more              6        3.517

Total                 24

From these data we derive $\chi_p^2 = 7.953$. This is distributed like $\chi^2$
with 2.5 degrees of freedom. Suppose that we choose the 5 per cent
level of significance for our test. Then we have from a table given in
Wallis and Moore's paper the permissible value of $\chi_p^2$ as 6.898. Hence
the deviation of the behavior from a random series is significant.





Alternatively, we might consider the distribution of all total completed 
phases. This number is 24 in our empirical series (Table 3). For N = 58 
the total number of completed phases is approximately normally distri¬ 
buted with mean 36.333 and variance 9.989. The normal deviate is 
3.902 and has a probability of about 0.0001. Hence the deviation of the 

behavior of the series from a not ordered random series is quite marked 
and statistically significant. 

The result of this test should be contrasted with the results of tests 
performed on the same data in connection with periodogram analysis 
(section 9.2, Example 3). All the tests given there had negative results. 

Example 3. We test now the phases of the 23 yearly items of the 
quantity of meat , 1919-41 (section 8.1, Example 2). We have Table 4. 


TABLE 4

Length of Phase          Number
                   Observed   Expected

1                      8        8.333
2                      2        3.483
3 or more              2        1.183

Total                 12

Hence we have $\chi_p^2 = 1.208$. Six-sevenths of this value is clearly not
significant at the 5 per cent level. Hence we must conclude that there
is no evidence for cyclical movements in the series of meat quantities.

The total number of completed phases is normally distributed with 
mean 13 and variance 5.65. The probability of the number of empirically 
observed phases (12) is about 0.67. Hence we conclude that there is 
probably no tendency to cycles in the quantity series. 

This result agrees partially with the tests of significance given in con¬ 
nection with the periodogram analysis of the same series (section 9.2, 
Example 2). 



Chapter 10 


The Interdependence of Successive 
Observations 


It has already been observed that the mutual interdependence of suc¬ 
cessive observations is one of the main difficulties in the analysis of 
economic time series. 1 We will present in this chapter some methods 
that have been designed to deal with this difficulty and that are promising 
as far as economic data are concerned. Many of the procedures proposed 
have been developed only recently. Not much experience has yet been 
accumulated regarding their efficiency in the practical application to 
economic data. Only extensive empirical econometric work will enable 
us eventually to choose more successfully between the various methods 
proposed. It should also be mentioned that there are still some very 
important mathematical problems to be solved in this field. 

Various methods dealing with autocorrelation (section 10.1) permit us 
to test this phenomenon, at least approximately. One of the most 
important methods which has been used extensively in newer econometric 
work is the Von Neumann ratio (section 10.2). 

We present also the theory of stochastic difference (section 10.3) and 
differential equations (section 10.4) and the treatment of observations with 
correlated errors by the method of least squares (section 10.5). There is 
an account of correlogram methods (section 10.6). These are very difficult
subjects. Nothing more than some suggestions about the available
methods can be offered. The ultimate evaluation of these procedures
will be largely a question of their success in econometric practice.

Many of the methods given here are quite new. There are still serious
shortcomings in this field, especially the almost complete lack of small
sample tests. But it seems that the methods proposed to deal with the
interdependence of observations hold more promise than any other
procedures in economic time series analysis. Our goal ought to be
a sufficiently developed theory of stochastic difference or differential
equations, which should be a part of a general theory of stochastic
processes. 2 In spite of great progress we are still very far from a satisfactory
treatment of these problems.

1 G. U. Yule: "Why do we sometimes get nonsense correlations between
time series?" Journal of the Royal Statistical Society, vol. 89 (1926), pp. 1 ff.

Most of the theory of time series deals with stationary series, i.e., series 
without the trend. The probabilities involved are invariant against a 
translation of the time axis. But it seems that both trend and cycle 
ought to be explained by the same stochastic mechanism with economic 
time series. 3 Hence it would be most important to develop the stochastic 
theory of processes which have both evolutionary and oscillatory com¬ 
ponents. 4 

10.1 Autocorrelation 

Many difficulties in the analysis of economic time series arise from the
fact that the individual items of a time series $X_1, X_2, \ldots, X_N$ are not
independent. Hence they cannot be treated as if they were the result
of random sampling. 1 This makes many of the classical results of
statistical analysis inapplicable in our case (section 10.5).

Hence it is of great importance to have tests for the dependence between
consecutive items of the series $X_t$ (t = 1, 2, ..., N). Several such tests
have been developed and will be discussed.

A non-parametric test will be presented in section 10.1.1. A test based 
on the normal distribution and a circular population is given in section 

10.1.2. Relations between autocorrelated series are presented in section 

10.1.3. 

10.1.1 WALD-WOLFOWITZ NON-PARAMETRIC TEST 2 

A test which has the advantage of great simplicity has been constructed
by Wald and Wolfowitz. This is an exact test for randomness. It is
non-parametric, since it is based upon all the possible permutations of
the actual observations $X_t$ (t = 1, 2, ..., N). Hence it is independent
of the parameters of the probability distribution of the population from
which our observations may be considered to have been taken. There
is, for instance, no assumption of normality.

2 W. Feller: An Introduction to Probability Theory and Its Applications,
vol. 1 (New York, 1950).

3 G. Tintner: "A 'simple' theory of business fluctuations," Econometrica,
vol. 10 (1942), pp. 317 ff.

4 H. Rubin: "Consistency of maximum likelihood estimates in the explosive
case," in T. C. Koopmans, ed.: Statistical Inference in Dynamic Economic Models
(New York, 1950), pp. 356 ff.

1 A. M. Mood: Theory of Statistics (New York, 1950), pp. 126 ff.

2 A. Wald and J. Wolfowitz: "An exact test for randomness in the
non-parametric case based on serial correlation," Annals of Mathematical
Statistics, vol. 14 (1943), pp. 378 ff.

We define the following quantities:

(1)  $S_k = \sum_{t=1}^{N} X_t^k$   (k = 1, 2, 3, 4)

These are the sums of the powers of our observations.

The test function is:

(2)  $R = \sum_{t=1}^{N-1} X_t X_{t+1} + X_N X_1$

It should be noted that we have here a circular definition of our test
function: Each item in our series is multiplied by the one immediately
following, but the last item $X_N$ is multiplied by $X_1$.

The fact that the test is circular may make its application in economic
time series somewhat doubtful, especially if they have a strong trend.
Then it may be expected that the term $X_N X_1$ will be very large. It may
even dominate the value of R.

Another test based upon the idea of circularity will be presented in the
next section, 10.1.2.

The mean of R is:

(3)  $E(R) = \dfrac{S_1^2 - S_2}{N - 1}$

The variance of R is:

(4)  $\sigma_R^2 = \dfrac{S_2^2 - S_4}{N - 1} + \dfrac{S_1^4 - 4 S_1^2 S_2 + 4 S_1 S_3 + S_2^2 - 2 S_4}{(N - 1)(N - 2)} - E(R)^2$

These are the mean and the variance if the series is random.

For large N the test function R is approximately normally distributed
with mean E(R) and variance $\sigma_R^2$. This provides a large sample
non-parametric test for autocorrelation.
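The test function and its moments can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, and `brute_force_moments` is added only as a check, since it enumerates every permutation of the observations and is practical for very short series alone.

```python
from itertools import permutations
from math import sqrt


def circular_r(x):
    """Test function (2): each item times its successor, last times first."""
    n = len(x)
    return sum(x[t] * x[(t + 1) % n] for t in range(n))


def r_moments(x):
    """Mean (3) and variance (4) of R over all permutations of x."""
    n = len(x)
    s1, s2, s3, s4 = (sum(v ** k for v in x) for k in (1, 2, 3, 4))
    mean = (s1 ** 2 - s2) / (n - 1)
    variance = ((s2 ** 2 - s4) / (n - 1)
                + (s1 ** 4 - 4 * s1 ** 2 * s2 + 4 * s1 * s3 + s2 ** 2 - 2 * s4)
                / ((n - 1) * (n - 2))
                - mean ** 2)
    return mean, variance


def brute_force_moments(x):
    """Check only: mean and variance of R by enumerating all permutations."""
    values = [circular_r(p) for p in permutations(x)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def normal_deviate(r, mean, variance):
    """Large-sample normal deviate of an observed R."""
    return (r - mean) / sqrt(variance)
```

For a long series one would compute `circular_r` and `r_moments` from the data and refer `normal_deviate` to the normal distribution; the enumeration serves only to confirm formulae (3) and (4) on a toy series.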

Example 1. As an example for the non-parametric test we use the
autocorrelation of the American wheat-flour prices, discussed in the author's
book, The Variate Difference Method. 3 We have for the coefficient R
from formula (2): R = 207.487. Its mathematical expectation is E(R) =
1765.580 [formula (3)]. Its variance is computed from formula (4):
$\sigma_R^2 = 987.613$.

For large samples R is normally distributed with mean E(R) and
variance $\sigma_R^2$. The normal deviate is in our case -49.6. The associated
probability is less than 0.00001. Hence the deviation between our
empirical R and the expected value E(R) has to be considered significant.
We conclude that it is very unlikely that the series of American wheat-flour
prices is a random series. There is, as expected, strong autocorrelation
between successive items.

3 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), pp.
156 ff.

Example 2. We consider now the series of American meat consumption , 
1919-41, for autocorrelation. The data are given in section 8.1, Example 
2. The series is annual; we have N = 23 observations. The empirical 
value of R computed by formula (2) is 197,021.29. 

From the data we have also E(R) = 196,037.96 and $\sigma_R^2 = 103,178.3245$.
Hence the normal deviate which corresponds to our empirical value is
about 3.05. From the tables of the normal distribution we have a
probability of the occurrence of such a large or a larger positive or negative
value: about 0.002. Hence the deviation is significant at the 1 per cent
level. It is very unlikely that our series is a true random series.


10.1.2 ANDERSON’S TEST FOR AUTOCORRELATION IN A 
CIRCULAR UNIVERSE 4 

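R. L. Anderson has defined the autocorrelation coefficient for lag L
as follows:

(1)  ${}_L R_N = \dfrac{{}_L C_N - \left( \sum_{t=1}^{N} X_t / N \right)^2}{\sum_{t=1}^{N} X_t^2 / N - \left( \sum_{t=1}^{N} X_t / N \right)^2}$

In this formula ${}_L C_N$ is the autocovariance for lag L:

(2)  ${}_L C_N = \dfrac{X_1 X_{L+1} + X_2 X_{L+2} + \cdots + X_{N-1} X_{L-1} + X_N X_L}{N}$

The expression in the denominator is the variance:

(3)  $\dfrac{\sum_{t=1}^{N} X_t^2}{N} - \left( \dfrac{\sum_{t=1}^{N} X_t}{N} \right)^2$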

It should be noted that we are again dealing with a circular universe.
Anderson has succeeded in deriving the small sample distribution of the
autocorrelation coefficient. The values of ${}_L R_N$ for various levels of
significance are tabulated in his paper.

For large samples ${}_L R_N$ is approximately distributed like ${}_1 R_N$, i.e.,
the autocorrelation coefficient for lag 1. But ${}_1 R_N$ is approximately
normally distributed for large N with mean $-1/(N - 1)$ and variance
$(N - 2)/(N - 1)^2$.

4 R. L. Anderson: "Distribution of the serial correlation coefficient,"
Annals of Mathematical Statistics, vol. 13 (1942), pp. 1 ff.; Serial Correlation
in the Analysis of Time Series (unpublished thesis, Ames, Iowa, 1941).
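The large sample approximation can be sketched in a few lines. The function names are hypothetical; the argument 47 corresponds to the N - 1 = 47 used for lag 1 in Example 1 below, and 0.8546 is the lag 1 coefficient given there.

```python
from math import sqrt


def lag1_large_sample_moments(n):
    """Approximate mean and variance of the lag 1 coefficient for large n."""
    mean = -1.0 / (n - 1)
    variance = (n - 2.0) / (n - 1) ** 2
    return mean, variance


def lag1_normal_deviate(r, n):
    """Normal deviate of an observed lag 1 coefficient r."""
    mean, variance = lag1_large_sample_moments(n)
    return (r - mean) / sqrt(variance)


mean_47, variance_47 = lag1_large_sample_moments(47)
deviate_lag1 = lag1_normal_deviate(0.8546, 47)
```

The deviate for the lag 1 coefficient is far beyond the usual limits of the normal distribution, in line with the significance found from Anderson's table.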

T. W. Anderson has given a very useful table for testing the
autocorrelation coefficient in the circular and in the non-circular case. 5

It is unfortunate that the autocorrelation coefficient is based upon the
circular definition. In economic time series it is frequently not
appropriate to use the circular definition of the coefficient and it is better to
compute the empirical coefficient from the non-circular definition:

(4)  $r_L = \dfrac{\sum_{t=1}^{N-L} x_t x_{t+L} - \left( \sum_{t=1}^{N-L} x_t \right) \left( \sum_{t=L+1}^{N} x_t \right) / (N - L)}{\sqrt{\left[ \sum_{t=1}^{N-L} x_t^2 - \left( \sum_{t=1}^{N-L} x_t \right)^2 / (N - L) \right] \left[ \sum_{t=L+1}^{N} x_t^2 - \left( \sum_{t=L+1}^{N} x_t \right)^2 / (N - L) \right]}}$

Then we may consider the distribution of the circular coefficient (1) as
an approximation for the distribution of coefficient (4). It should be
remembered, however, that coefficient (4) is computed from N - L and
not from N observations.
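The two definitions can be compared with a small sketch. It is illustrative only: the function names are hypothetical, the circular version follows Anderson's definition with products taken around the circle, and the non-circular version is written as the ordinary correlation between the first N - L and the last N - L items, which is an assumption about the exact form of formula (4).

```python
from math import sqrt


def circular_autocorrelation(x, lag):
    """Circular autocorrelation coefficient for the given lag."""
    n = len(x)
    mean = sum(x) / n
    # Autocovariance: item t times item t + lag, wrapping around the end.
    c = sum(x[t] * x[(t + lag) % n] for t in range(n)) / n
    variance = sum(v * v for v in x) / n - mean ** 2
    return (c - mean ** 2) / variance


def noncircular_autocorrelation(x, lag):
    """Correlation between x_1, ..., x_{N-L} and x_{L+1}, ..., x_N."""
    m = len(x) - lag
    head, tail = x[:m], x[lag:]
    sxy = sum(a * b for a, b in zip(head, tail)) - sum(head) * sum(tail) / m
    sxx = sum(a * a for a in head) - sum(head) ** 2 / m
    syy = sum(b * b for b in tail) - sum(tail) ** 2 / m
    return sxy / sqrt(sxx * syy)
```

For a series with a strong trend the two coefficients can differ greatly: for the items 1, 2, 3, 4 the circular coefficient for lag 1 is -0.2, because the product of the last and the first item enters, while a strictly linear series gives a non-circular coefficient of 1.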

A related test will be discussed in section 10.2. The relationship
between the two tests has been investigated by T. W. Anderson 6 [formula
(5) of section 10.2].

A test for the autocorrelation of residuals from Fourier series has been 
given by R. L. Anderson and T. W. Anderson. 5 6 7 


5 T. W. Anderson: "On the theory of testing serial correlation," Skandinavisk
Aktuarietidskrift, vol. 31 (1948), pp. 88 ff. T. C. Koopmans: "Serial correlation
and quadratic forms in normal variables," Annals of Mathematical Statistics,
vol. 13 (1942), pp. 14 ff. W. J. Dixon: "Further contributions to the problem
of serial correlation," ibid., vol. 15 (1944), pp. 119 ff. H. Rubin: "On the
distribution of the serial correlation coefficient," ibid., vol. 16 (1945), pp. 211 ff.
W. G. Madow: "Note on the distribution of the serial correlation coefficient,"
ibid., vol. 16 (1945), pp. 308 ff. R. B. Leipnik: "Distribution of the serial
correlation coefficient in a circularly correlated universe," ibid., vol. 18 (1947),
pp. 80 ff. M. Ogawara: "A note on the test of serial correlation coefficients,"
ibid., vol. 22 (1951), pp. 115 ff.

6 T. W. Anderson: op. cit., pp. 115 ff.

7 R. L. Anderson and T. W. Anderson: "Distribution of the circular serial
correlation coefficient for residuals from Fourier series," Annals of Mathematical
Statistics, vol. 21 (1950), pp. 59 ff.


Example 1. In economic time series it is frequently not advisable to
compute the circular autocorrelation coefficient. But we may take
Anderson’s circular distribution as an approximation to the distribution
in the non-circular case.

We give in Table 1 the autocorrelation coefficients for the series of
American wheat-flour prices, 1890-1937. 8 The number of items in the
series is N = 48.

TABLE 1

Autocorrelation Coefficients for American Wheat-Flour Prices

        Non-circular Autocorrelation     Number of
 Lag            Coefficient             Observations
  L                 r_L                    N - L

  1               0.8546†                    47
  2               0.6854†                    46
  3               0.5114†                    45
  4               0.3489†                    44
  5               0.3091*                    43
  6               0.2724*                    42
  7               0.2820*                    41
  8               0.2950*                    40
  9               0.2427*                    39
 10               0.1471                     38
 11               0.0096                     37
 12              -0.0999                     36
 13              -0.1564                     35
 14              -0.1580                     34
 15              -0.1098                     33

* Significant at the 5 per cent level.
† Significant at the 5 and 1 per cent levels.


We have from Anderson’s table for lag 1, and hence N — 1 = 47 
degrees of freedom: upper 5 per cent significance point 0.214, upper 1 per 
cent significance point 0.308. 

Hence it appears that the autocorrelation with lag 1 is significant at 
the 1 per cent level. 

We have also for lag 2, hence N — 2 = 46 degrees of freedom, the 
following result: Since N is large, we may use as an approximation the 
distribution for lag 1. But there we have by interpolation the upper 
5 per cent significance point as 0.217, and the upper 1 per cent significance 
point appears as 0.311. Hence our empirical result for lag 2 (0.6854) is 


8 G. Tintner: op. cit., pp. 156 ff.




highly significant. We proceed similarly for the other autocorrelation 
coefficients. 

We see that the autocorrelation coefficients from lag 1 to lag 4 are
significant, judged by the 1 per cent level of significance. The
autocorrelation coefficients for lags 5 to 9 are significant on the 5 per
cent level of significance. The other autocorrelation coefficients are not
significant.

A similar result is reached if we use the large sample approximations
given above. The mean of the autocorrelation coefficient (lag 1) for
N - 1 = 47 is -1/46 = -0.021739, and its variance is 45/(46)^2 = 0.0213.
Hence we have, as a large sample estimate for the upper 5 per cent point
of significance, 0.220; for the upper 1 per cent point 0.320. It is
remarkable that the large sample approximations are very close to the
correct values given by Anderson.
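The large sample calculation just described can be sketched in a few lines. The sketch assumes that the mean and variance follow the pattern of the worked figures, namely -1/(n - 1) and (n - 2)/(n - 1)^2 with n = 47, and that the significance points are one-sided normal quantiles; the function name `large_sample_points` is hypothetical. The results come out near, though not exactly at, the rounded values quoted in the text.

```python
from statistics import NormalDist


def large_sample_points(n, levels=(0.05, 0.01)):
    """Approximate upper significance points of the lag-1 coefficient.

    Assumption: mean -1/(n - 1) and variance (n - 2)/(n - 1)**2, where n
    is the number of observations entering the coefficient (47 here).
    """
    mean = -1.0 / (n - 1)
    variance = (n - 2) / (n - 1) ** 2
    sd = variance ** 0.5
    points = {}
    for alpha in levels:
        z = NormalDist().inv_cdf(1.0 - alpha)  # one-sided normal quantile
        points[alpha] = mean + z * sd
    return mean, variance, points


mean, variance, points = large_sample_points(47)
print(round(mean, 6), round(variance, 4))
print({alpha: round(value, 3) for alpha, value in points.items()})
```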

The application of the large sample tests confirms the conclusions
indicated above. The price series investigated shows very strong
autocorrelation.

Example 2. We give in Table 2 non-circular autocorrelation coefficients 
of the quantity of meat consumed in the United States, 1919-41. The 
data are deviations from a cubic trend. The computation of this trend 
was discussed in section 8.1, Example 2. There are N = 23 yearly 
observations.

TABLE 2

Quantity of Meat

         Non-circular
        Autocorrelation     Number of      5 Per Cent Level
 Lag      Coefficient      Observations    of Significance
  L           r_L             N - L

  1          0.0895             22              0.289
  2         -0.3513             21             -0.393
  3         -0.1779             20             -0.399
  4         -0.1876             19             -0.412
  5         -0.2724             18             -0.425
  6         -0.0154             17             -0.438
  7          0.2382             16              0.323
  8          0.0211             15              0.328
  9          0.3328             14              0.335
 10          0.3141             13              0.341
 11         -0.3389             12             -0.516


We present in Table 2 also the 5 per cent limits, taken from the
publication of R. L. Anderson. These values are of course only approximate,
since we have not computed a circular autocorrelation coefficient.





We see that none of the empirical coefficients is statistically significant. 
Hence we may conclude that the series of the quantity of meat consumed 
behaves very much like a random series. It may be surmised that the 
cubic trend contains all the autocorrelation which fails to show up in 
our series of deviations from this trend. 

Example 3. We present now the autocorrelation coefficients of the 
American wholesale price index of all commodities , 1890-1947. The data 
are deviations from the trend. The trend is itself a moving average of 
variable length. The computation of the trend is to be found in section 
8.2.1, Example 1. There are N = 52 items. 


TABLE 3

         Non-circular
        Autocorrelation     Number of      5 Per Cent Level
 Lag      Coefficient      Observations    of Significance
  L           r_L             N - L

  1         -0.0477             51             -0.246
  2         -0.1032             50             -0.248
  3          0.0737             49              0.210
  4         -0.3814             48             -0.242
  5         -0.0939             47             -0.238
  6          0.1273             46              0.216
  7          0.1312             45              0.218
  8          0.6738             44              0.220
  9         -0.0438             43             -0.256
 10          0.1141             42              0.225
 11         -0.0000             41             -0.274
 12         -0.0568             40             -0.279
 13         -0.1356             39             -0.275
 14         -0.2036             38             -0.271
 15         -0.0832             37             -0.268
 16          0.0056             36              0.239
 17          0.3567             35              0.242
 18          0.1999             34              0.245
 19          0.0010             33              0.248
 20         -0.0054             32             -0.315


We give in Table 3 the empirical autocorrelation coefficients. We 
present also the number of observations which have been actually used 
for the computation of a given coefficient. The last column gives the 
value of the circular autocorrelation coefficient for the 5 per cent level 
of significance, taken from the tables of R. L. Anderson. We see that 
the majority of the autocorrelation coefficients are not significant at the 





5 per cent level. The autocorrelation coefficients for the following lags 
are, however, significant: 4, 8, 17. Our index numbers (deviations from 
the moving average) cannot be considered to be a random series. 
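The comparison of each coefficient with its tabulated 5 per cent point can be written out mechanically. The sketch below copies the figures of Table 3 and assumes the reading used in the text: a coefficient counts as significant when it lies beyond the tabulated point on the same side of zero. The names `TABLE_3` and `is_significant` are hypothetical.

```python
# (lag, coefficient r_L, tabulated 5 per cent point), copied from Table 3
TABLE_3 = [
    (1, -0.0477, -0.246), (2, -0.1032, -0.248), (3, 0.0737, 0.210),
    (4, -0.3814, -0.242), (5, -0.0939, -0.238), (6, 0.1273, 0.216),
    (7, 0.1312, 0.218), (8, 0.6738, 0.220), (9, -0.0438, -0.256),
    (10, 0.1141, 0.225), (11, -0.0000, -0.274), (12, -0.0568, -0.279),
    (13, -0.1356, -0.275), (14, -0.2036, -0.271), (15, -0.0832, -0.268),
    (16, 0.0056, 0.239), (17, 0.3567, 0.242), (18, 0.1999, 0.245),
    (19, 0.0010, 0.248), (20, -0.0054, -0.315),
]


def is_significant(r, point):
    """True when r lies beyond the tabulated point in the same tail."""
    return r > point if point > 0 else r < point


significant_lags = [lag for lag, r, point in TABLE_3 if is_significant(r, point)]
print(significant_lags)
```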

10.1.3 RELATIONS BETWEEN AUTOCORRELATED SERIES 

Autocorrelation is extremely common with economic time series.
Hence it would be very convenient to have some valid tests which would
enable us to form an idea about the relationship between two
autocorrelated series as it exists in the population. Unfortunately nothing is
available but some approximative large sample tests which are not too
reliable.

M. S. Bartlett 9 has given some useful formulae for large sample tests
in this field. Let us assume that we have two series of N terms. The
autocorrelation coefficients for lag 1 are for these series: r_1 and r_1'.
Assume that the true correlation between these series in the population
is zero, so that the two series are not correlated. Then the variance of
the empirical correlation coefficient between them is:

(1)    V = \frac{1 + r_1 r_1'}{N(1 - r_1 r_1')}


This is a large sample approximation and only valid in a special case. 
Only the leading terms of an expansion have been retained. 

It is evident that an autocorrelated series of a given length corresponds
roughly to a pure random series of a shorter length. G. H. Orcutt and
S. F. James 10 have on the basis of a very elaborate sampling study devised
the following heuristic method of testing the correlation between
autocorrelated but (in the population) uncorrelated and not serially correlated
time series. It should be recalled that autocorrelation is a correlation
of a series with itself, lagged by some time units. Serial correlation is
the correlation of one series with another lagged series.

Orcutt and James estimate the variance V of the correlation coefficient 

9 M. S. Bartlett: “Some aspects of the time correlation problem in regard
to tests of significance,” Journal of the Royal Statistical Society, vol. 98 (1935),
pp. 536 ff. [...] M. H. Quenouille: “Notes on the calculation of
autocorrelations of linear autoregressive schemes,” Biometrika, vol. 34 (1947),
pp. 365 ff. P. A. P. Moran: “Some theorems on time series,” ibid., vol. 34
(1947), pp. 281 ff.

10 G. H. Orcutt and S. F. James: “Testing the significance of correlations
between time series,” Biometrika, vol. 35 (1948), pp. 397 ff.





between two series, whose first sample autocorrelation coefficients
are r_1 and r_1':

(2)    V = \frac{1 + r_1 r_1'}{N(1 - r_1 r_1')} - \frac{2 r_1 r_1' (1 - r_1^N {r_1'}^N)}{N^2 (1 - r_1 r_1')^2}

If V < 0.25, the effective number of observations n' is estimated from
the formula:

(3)    V = \frac{1}{n' - 1}


The number of effective observations n' is then used for entering the
tables for tests of significance of the simple correlation coefficient for
unrelated series. 11

P. A. P. Moran 12 proposes a test for the covariance between two
autocorrelated time series. Let the series be written in the form:

(4)    x_t = \sum_{i=0}^{\infty} a_i \epsilon_{t-i}

(5)    y_t = \sum_{i=0}^{\infty} b_i \eta_{t-i}

Here \epsilon_t and \eta_t are random series. They have mean zero and variances
\sigma^2 and \tau^2. They are not autocorrelated, not correlated, and not serially
correlated.

We denote the autocovariance of x_t by:

(6)    E x_t x_{t-s} = c_s

The autocovariance of y_t is:

(7)    E y_t y_{t-s} = d_s

The covariance C of x_t and y_t has the following property:

(8)    C = \frac{1}{N} \sum_{t=1}^{N} x_t y_t


Moran proves in his paper that under certain conditions for large 


11 R. A. Fisher and F. Yates: Statistical Tables for Biological , Agricultural 
and Medical Research (London, 1938), p. 36. G. W. Snedecor: Statistical 
Methods (4th ed., Ames, Iowa, 1948), p. 149. 

12 P. A. P. Moran: op. cit., pp. 281 ff.




samples this quantity has mean zero and is in the limit normally
distributed with the variance:

(9)    V = \frac{c_0 d_0 + 2 \sum_{s=1}^{\infty} c_s d_s}{N}


This method will be illustrated in section 10.3.1. 

Example 1. We consider the yearly American series of corn prices
and the yearly series of corn stocks, 1926-40 (N = 13). We have for
the prices for the first order autocorrelation coefficient r_1 = 0.369, and
for the stocks r_1' = 0.734.

The empirical correlation coefficient between the prices and stocks is 
0.898. We use Bartlett’s large sample formula in order to test it. The 
variance of the correlation coefficient is [formula (1)]: 

(10)    V = \frac{1 + (0.369)(0.734)}{(13)[1 - (0.369)(0.734)]} = 0.134070


The ratio of the empirical correlation coefficient to its standard error
(square root of the variance) is 2.452. This is certainly significant at the
5 per cent level, where we have for the normal distribution only a
permissible ratio of 1.96. Hence there is in all probability a definite
relationship between prices and stock of corn in the population which
corresponds to our sample.

This corresponds of course to our expectations. But it should be
remembered that the test used is a large sample test and our series are
rather short.
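The arithmetic of this example can be reproduced with a short sketch of Bartlett's large sample formula (1). The function name `bartlett_variance` is hypothetical; the inputs are the figures quoted in the example.

```python
from math import sqrt


def bartlett_variance(r1_x, r1_y, n):
    """Large sample variance of the correlation between two uncorrelated
    series with first autocorrelation coefficients r1_x and r1_y."""
    product = r1_x * r1_y
    return (1.0 + product) / (n * (1.0 - product))


variance = bartlett_variance(0.369, 0.734, 13)
ratio = 0.898 / sqrt(variance)  # empirical correlation over its standard error
print(round(variance, 6), round(ratio, 3))
```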

Example 2. We use the same data (corn prices and stocks) as in the
previous example in order to illustrate Orcutt’s method of testing the
relationship between autocorrelated series. We have from our data for
the variance of the correlation coefficient [formula (2)]:

(11)    V = \frac{1 + (0.369)(0.734)}{(13)[1 - (0.369)(0.734)]} - \frac{(2)(0.369)(0.734)[1 - (0.369)^{13}(0.734)^{13}]}{(13)^2 [1 - (0.369)(0.734)]^2} = 0.128041

This is less than 0.25 and hence we may use Orcutt’s method.
From relationship (11):

(12)    0.128041 = \frac{1}{n' - 1}

we estimate n' = 9.8 as the number of degrees of freedom to use for the test.
Hence we may use n' = 10. This shows that our two samples of 13 items
are really in a sense equivalent to random series consisting of only 10 items.





The empirical correlation coefficient between the two series is 0.898. 
But for 10 degrees of freedom we may have at the 5 per cent level a 
coefficient as large as 0.576, at the 1 per cent level a correlation coefficient 
as large as 0.708, and at the 0.1 per cent level a correlation coefficient of 
0.823. Hence our empirical coefficient is significant. 

Orcutt’s test leads also to the conclusion that stocks and prices of corn 
are probably correlated in the population which corresponds to our 
sample. This result corresponds indeed to our expectations. 
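As a check on the variance used in this example, the Orcutt and James expression, formula (2) of this section, can be evaluated directly. This is a minimal sketch; the function name `orcutt_james_variance` is hypothetical, and only the variance and the condition V < 0.25 are computed.

```python
def orcutt_james_variance(r1_x, r1_y, n):
    """Variance of the correlation coefficient: Bartlett's leading term
    minus the finite-length correction term of formula (2)."""
    product = r1_x * r1_y
    leading = (1.0 + product) / (n * (1.0 - product))
    # r1_x**n * r1_y**n equals (r1_x * r1_y)**n
    correction = 2.0 * product * (1.0 - product ** n) / (n ** 2 * (1.0 - product) ** 2)
    return leading - correction


variance = orcutt_james_variance(0.369, 0.734, 13)
method_applies = variance < 0.25
print(round(variance, 6), method_applies)
```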


10.1.4 AUTOCORRELATION OF RESIDUALS 

In econometric work we are frequently interested in the autocorrelation 
of residuals from regression equations. A large sample test for the 
autocorrelation of residuals has recently been given by P. A. P. Moran. 13 

Suppose that we have a linear regression between a dependent variable 
Y and an independent variable X: 

(1) Y=a + bX 


The least squares estimates of a and b are denoted by a* and b* (section 
5.1). The residuals are: 

(2)    e_t = Y_t - a^* - b^* X_t    (t = 1, 2, \ldots, N)


We want to test the circular autocorrelation coefficient of the residuals
e_t:

(3)    R_1 = \frac{\sum_{t=1}^{N-1} e_t e_{t+1} + e_1 e_N}{\sum_{t=1}^{N} e_t^2}

[see formula (1), section 10.1.2].

Let x_t be the deviation of the independent variable X_t from its
arithmetic mean. Then we define the two first autocorrelation coefficients
of the x_t again in a circular fashion:

(4)    r_1 = \frac{\sum_{t=1}^{N-1} x_t x_{t+1} + x_1 x_N}{\sum_{t=1}^{N} x_t^2}

(5)    r_2 = \frac{\sum_{t=1}^{N-2} x_t x_{t+2} + x_1 x_{N-1} + x_2 x_N}{\sum_{t=1}^{N} x_t^2}


13 P. A. P. Moran: “A test for the serial independence of residuals,”
Biometrika, vol. 37 (1950), pp. 178 ff. J. Durbin and G. S. Watson: “Testing for
serial correlation in least squares regression,” ibid., vol. 37 (1950), pp. 409 ff.




Moran shows now that the mathematical expectation of the
autocorrelation coefficient of the residuals R_1 is:

(6)    E(R_1) = \frac{-(1 + r_1)}{N - 2}

Further:

(7)    E(R_1^2) = \frac{N + 1}{N^2} + \frac{2r_1 + 3r_1^2 - 2r_2}{N(N - 2)}

The variance of R_1 is given by:

(8)    \sigma_{R_1}^2 = E(R_1^2) - [E(R_1)]^2

It is also shown that for large samples the quantity:

(9)    \frac{R_1 - E(R_1)}{\sigma_{R_1}}

is normally distributed with mean zero and variance 1.

Example 1. We use Moran's method for testing the residuals from a
regression of the quantity of meat consumed in the United States (Y)
on the price of meat in the United States (X). The period covered is
1919-41. We have N = 23 annual data.

The least squares estimate of relationship (1) is: 


(10)    Y = 186.7758 - 0.222923 X


The empirical first-order autocorrelation coefficient of the residuals
computed with the help of formula (3) is R_1 = 0.657360. The circular
autocorrelation coefficients of the price series (X) are computed from
formulae (4) and (5): r_1 = 0.534792 and r_2 = 0.018514.

Hence we have for the mathematical expectation of R_1 from formula (6):

(11)    E(R_1) = \frac{-(1 + 0.534792)}{21} = -0.073085


Further we obtain from formula (7):

(12)    E(R_1^2) = \frac{24}{(23)^2} + \frac{(2)(0.534792) + (3)(0.534792)^2 - (2)(0.018514)}{(23)(21)} = 0.049283

Hence we have from formula (8) for the variance of R_1:

        \sigma_{R_1}^2 = 0.043941

Quantity (9) is (0.657360 + 0.073085)/0.209622 = 3.485. This is
certainly significant, if we use the normal approximation. Hence we



may conclude that there is definitely a positive autocorrelation between
the residuals from the regression equation (10).

It is, however, somewhat doubtful whether our series of 23 items is
really long enough to permit us the use of the large sample test. No
small sample test is yet in existence.
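The steps of this test, formulas (6) to (9), can be collected in one small function. This is a sketch under a stated assumption: the second moment in formula (7) is taken as (N + 1)/N^2 + (2 r_1 + 3 r_1^2 - 2 r_2)/[N(N - 2)], which is how the worked example reads. The function name `moran_residual_test` is hypothetical.

```python
from math import sqrt


def moran_residual_test(R1, r1, r2, N):
    """Large sample test for the circular autocorrelation of residuals.

    R1: circular autocorrelation of the residuals, formula (3)
    r1, r2: circular autocorrelations of the regressor, formulas (4), (5)
    """
    mean = -(1.0 + r1) / (N - 2)                      # formula (6)
    second_moment = ((N + 1) / N ** 2                 # formula (7), as assumed
                     + (2 * r1 + 3 * r1 ** 2 - 2 * r2) / (N * (N - 2)))
    variance = second_moment - mean ** 2              # formula (8)
    standardized = (R1 - mean) / sqrt(variance)       # formula (9)
    return mean, second_moment, variance, standardized


mean, second_moment, variance, standardized = moran_residual_test(
    R1=0.657360, r1=0.534792, r2=0.018514, N=23)
print(round(mean, 6), round(second_moment, 6), round(variance, 6),
      round(standardized, 3))
```

The standardized value is then compared with the normal distribution, for example with 1.96 at the 5 per cent level.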


10.2 Ratio of Mean Square Successive Difference to the 
Variance 

A method for testing the independence of successive observations in a
given series is the ratio of mean square successive difference to the variance.
We have a series of N observations: X_1, X_2, ..., X_N. The mean square
successive difference is:

(1)    \delta^2 = \frac{\sum_{t=1}^{N-1} (X_{t+1} - X_t)^2}{N - 1}

This quantity is closely related to the variance of the first differences of 
the series. This will be discussed in section 11.2. 

The variance is defined as:

(2)    s^2 = \frac{\sum_{t=1}^{N} (X_t - \bar{X})^2}{N}

In formula (2) \bar{X} is the arithmetic mean of the series.
The ratio required is:

(3)    \frac{\delta^2}{s^2}



Its distribution has been established by Von Neumann and others, 1 
under the assumption that the series X t is not autocorrelated. The 
sample also comes from a population which follows a normal distribution. 
There is now a table of significance levels available. 2 We make tests of 
significance for the null hypothesis that the series is a pure random series. 


1 J. Von Neumann: “Distribution of ratio of the mean successive difference 
to the variance," Annals of Mathematical Statistics, vol. 12 (1941), pp. 367 ff.; 
“A further remark on the distribution of the ratio of the mean successive 
differences to the variance," ibid., vol. 13 (1942), pp. 86 ff. B. I. Hart: “Tabu¬ 
lation of the probabilities for the ratio of the mean square successive difference 
to the variance," ibid. , vol. 13 (1942), pp. 207 ff. 

2 B. I. Hart: “Significance levels for the ratio of the mean square successive 
difference to the variance," Annals of Mathematical Statistics , vol. 13 (1942), 
pp. 445 ff. 





According to this hypothesis, all the items are normally distributed and 
independent. There is no autocorrelation between the successive items 
of our hypothetical time series. 

It has been shown that for large samples, i.e., for N large, the
distribution of the ratio tends to the normal distribution. Its mean is
2N/(N - 1) and its variance is:

(4)    \frac{4N^2(N - 2)}{(N - 1)^3 (N + 1)}


The advantage of the ratio over the tests given in the preceding section
10.1 for autocorrelation is the following: All tests given there assume a
circular universe, i.e., a universe in which the observation number N + L
can be identified with observation L, if there are altogether N observations.
For large samples this assumption may be trivial and hence the tests may
be used as approximations. But the circular definition of autocorrelation
is assuredly not applicable for short time series with pronounced trends.
Terms like the product of the last and the first observation may give a
very misleading idea about the value of the autocorrelation in the
population which corresponds to our sample.

There is no assumption about circularity with the Von Neumann ratio.
T. W. Anderson 3 has shown that there is a definite relationship between
the first circular autocorrelation coefficient r_1 and the ratio:

(5)    \frac{\delta^2}{s^2} = \frac{2N(1 - r_1)}{N - 1}


It is also easy to show a relationship between the ratio and the variances
of finite differences. 4 We have evidently:

(6)    \frac{\delta^2}{s^2} = \frac{2N V_1}{(N - 1) V_0}

where V_0 is the variance of the original data and V_1 is the variance of the
first difference series of the observed series. These quantities will be
defined in section 11.2, dealing with the variate difference method.

One disadvantageous feature is the appearance of the mean 5 in the 


3 T. W. Anderson: “On the theory of testing serial correlation,” Skandinavisk
Aktuarietidskrift, vol. 31 (1948), pp. 115 ff.

4 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940),
pp. 139 ff.

5 D. G. Champernowne: “Sampling theory applied to autoregressive
sequences,” Journal of the Royal Statistical Society, series B
(methodological), vol. 10 (1948), pp. 204 ff.





formula, since we need the mean in order to compute the quantity s^2,
the variance of the original series of observations [formula (2)].
This difficulty could be avoided if we used instead of the ratio, for instance,
the quantity V_2/V_1, where V_2 is the variance of the series of the second
differences of the data and V_1 the variance of the series of the first
differences. The small sample distribution of this quantity is not known.

A large sample distribution of a related quantity will be presented in
section 11.2.2.

Example 1. We take as our example the series of annual wheat-flour
prices, 1890-1937. 6 There are 48 items; hence N = 48.

We have for the variance of the original series s^2 = 225.456/48 = 4.6970
[formula (2)]. For the mean square successive difference we obtain
\delta^2 = 65.445/47 = 1.3924 [formula (1)].

Hence we have finally for the ratio \delta^2/s^2 = 0.296. But, according to
the tabulated levels of significance, we have for the 5 per cent level for
N = 48 a permissible ratio of 1.57. For the 1 per cent level of significance
the permissible ratio is 1.38. Hence the deviation is significant at the
1 per cent level of significance. It is extremely unlikely that the series
of annual wheat-flour prices is a random series.

We may also use the large sample approximation, since 48 is a
reasonably large number. Then the ratio is in the limit normally distributed
with mean (2)(48)/47 = 2.0426. The variance is (4)(48)^2(46)/[(47)^3(49)] =
0.0833. This quantity is computed from formula (4). The probability
of our empirical ratio or a smaller ratio is less than 0.0001. Hence the
ratio is certainly significant.

The conclusions indicated above are confirmed by the large sample
test. Our results coincide with the autocorrelation test presented in
section 10.1.2, Example 1.
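The ratio and its large sample moments can be sketched as follows. The function names are hypothetical. The first function works from a raw series according to formulas (1) to (3); the wheat-flour figures are then entered from the sums quoted above, since the raw prices are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def von_neumann_ratio(x):
    """Mean square successive difference divided by the variance."""
    n = len(x)
    mean = sum(x) / n
    delta2 = sum((x[t + 1] - x[t]) ** 2 for t in range(n - 1)) / (n - 1)
    s2 = sum((value - mean) ** 2 for value in x) / n
    return delta2 / s2


def large_sample_moments(n):
    """Mean 2N/(N - 1) and variance from formula (4)."""
    mean = 2.0 * n / (n - 1)
    variance = 4.0 * n ** 2 * (n - 2) / ((n - 1) ** 3 * (n + 1))
    return mean, variance


# Wheat-flour prices, N = 48, using the sums of squares quoted in the text
ratio = (65.445 / 47) / (225.456 / 48)
mean, variance = large_sample_moments(48)
z = (ratio - mean) / sqrt(variance)
lower_tail = NormalDist().cdf(z)  # probability of a ratio this small or smaller
print(round(ratio, 3), round(mean, 4), round(variance, 4), lower_tail < 0.0001)
```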

Example 2. Let us now analyze the yearly series of meat prices,
1919-41. The data are presented in section 8.1, Example 2. The
variance of the original data is s^2 = 68.761. The variance of the first
difference series gives \delta^2 = 60.859. Hence the Von Neumann ratio is:

        \frac{\delta^2}{s^2} = 0.885

We have N = 23. The permissible value for the ratio at the 5 per
cent level of significance is 1.4035, at the 1 per cent level 1.1456. Hence
our empirical value is significant at the 1 per cent level.

For the large sample approximation we have the proposition that the 


6 G. Tintner: op. cit., pp. 40 ff.





ratio is approximately normally distributed with mean 2.091 and variance 
0.174. Hence the probability of our empirical ratio is about 0.002. 

Both tests lead us to the conclusion that it is very unlikely that the 
series of meat prices is a random series. 

10.3 Stochastic Difference Equations (Linear Autoregression) 

Stochastic difference equations are formed from ordinary difference 
equations by adding stochastic or error terms. We will treat only linear 
equations. Such stochastic processes are also called schemes of linear 
autoregression (section 10.6.2). 

We discuss in this section various methods for handling these problems.
Section 10.3.1 deals with first-order equations; section 10.3.2 with
second-order equations. In section 10.3.3 we present methods for dealing with
one single difference equation of arbitrary order. Section 10.3.4 treats
systems of difference equations.

The well-known problem of lag correlation or serial correlation is treated
in section 10.3.5. Errors in the variables are introduced in section 10.3.6.
Ideas relating to process analysis, well known in modern economic theory,
are also relevant (section 10.3.7).

The application of correlogram methods to stochastic difference 
equations will be discussed in section 10.6.2. 

10.3.1 FIRST-ORDER DIFFERENCE EQUATIONS 

A first-order linear stochastic difference equation is of the following
form: 1

(1)    X_{t+1} = aX_t + b + \epsilon_{t+1}

where we assume the random variable \epsilon_t to be non-autocorrelated with
mean zero and variance \sigma^2; a and b are constants. There is a single lag
of one time unit. We also impose the condition:

(2)    X_0 = 0

The general term of the series X_t is given by:

(3)    X_t = \frac{b(1 - a^t)}{1 - a} + \sum_{s=1}^{t} a^{t-s} \epsilon_s

We have evidently:

(4)    E X_t = \frac{b(1 - a^t)}{1 - a}

1 H. Wold: A Study in the Analysis of Stationary Time Series (Uppsala,
1938). L. Hurwicz: “Least squares bias in time series,” in T. C. Koopmans, ed.:
Statistical Inference in Dynamic Economic Models (New York, 1950), pp. 365 ff.





We define:

(5)    x_t = X_t - E X_t

This is the deviation of X_t from its mathematical expectation (4).
Then we have the variance of X_t:

(6)    E x_t^2 = \frac{\sigma^2 (1 - a^{2t})}{1 - a^2}

For large t we have:

(7)    \lim_{t \to \infty} E x_t^2 = \frac{\sigma^2}{1 - a^2}

This will not be infinite or negative, if |a| < 1.
The autocovariance of X_t and X_{t+L} is given by:

(8)    E x_t x_{t+L} = \frac{\sigma^2 a^L (1 - a^{2t})}{1 - a^2}

We have again for large t:

(9)    \lim_{t \to \infty} E x_t x_{t+L} = \frac{\sigma^2 a^L}{1 - a^2}

Hence we have for the autocorrelation between X_t and X_{t+L}:

(10)   r_L = \frac{a^L (1 - a^{2t})}{\sqrt{(1 - a^{2t})(1 - a^{2t+2L})}}

For large t this becomes:

(11)   \lim_{t \to \infty} r_L = a^L

This is a decreasing exponential for 0 < a < 1 and a damped harmonic 
oscillation for — 1 < a < 0. The empirical correlogram may, however, 
not show this phenomenon very clearly. Bartlett has shown that the 
consecutive empirical autocorrelation coefficients are themselves very 
highly correlated. 2 

The correlogram is a graph of these autocorrelation coefficients. It
follows from Bartlett's result that the appearance of the correlogram may
depend more upon the empirical autocorrelation coefficients in the sample
than upon the underlying stochastic scheme (1) which holds for the
population.
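The two shapes of the theoretical correlogram can be illustrated numerically. The sketch below evaluates the autocorrelation of the first-order scheme for finite t and in the limit, taking the finite-t expression as a^L (1 - a^{2t}) / sqrt[(1 - a^{2t})(1 - a^{2t+2L})]; the function name and the value a = 0.6 are illustrative assumptions only.

```python
from math import sqrt


def ar1_autocorrelation(a, lag, t=None):
    """Autocorrelation between X_t and X_{t+lag} for |a| < 1, a != 0.

    With t given (t >= 1) the finite-t expression is used; with t=None
    the limiting value a**lag is returned.
    """
    if t is None:
        return a ** lag
    numerator = a ** lag * (1.0 - a ** (2 * t))
    denominator = sqrt((1.0 - a ** (2 * t)) * (1.0 - a ** (2 * t + 2 * lag)))
    return numerator / denominator


positive = [ar1_autocorrelation(0.6, lag) for lag in range(1, 6)]   # decreasing exponential
negative = [ar1_autocorrelation(-0.6, lag) for lag in range(1, 6)]  # alternating, damped
finite = ar1_autocorrelation(0.6, 3, t=50)                          # already close to 0.6**3
print(positive, negative, finite)
```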


2 M. S. Bartlett: “On the theoretical specifications of sampling properties
of autoregressive time series,” Journal of the Royal Statistical Society, vol. 8
(1946), supplement, pp. 113 ff.





It is of some interest to examine the case where in equation (1) we have
a = 1, b = 0. This idea underlies Orcutt’s method to be described in
section 11.3. Then we have simply:

(12)   X_{t+1} = X_t + \epsilon_{t+1}

The solution is:

(13)   X_t = \sum_{s=1}^{t} \epsilon_s

so that X_t is the sum of a random series. We have evidently E X_t = 0.
The variance is:

(14)   E X_t^2 = t\sigma^2

This equation shows that the variance of X_t increases indefinitely with
time. The stochastic process (12) is not stationary.

We have for the autocovariance:

(15)   E X_t X_{t+L} = t\sigma^2

and for the autocorrelation coefficient:

(16)   r_L = \frac{t}{\sqrt{t(t + L)}}

Evidently we have:

(17)   \lim_{t \to \infty} r_L = 1

This shows that in the limit the correlogram is 1.

As an application of the more general scheme (1) let us apply P. A. P.
Moran's 3 method for testing the independence of two unrelated
autoregressive series in large samples (section 10.1.3).

Let \epsilon_t and \eta_t be two pure random series with mean zero and variances
\sigma^2 and \tau^2. Neither is autocorrelated, and there are no serial correlations
between them. They are also independent.

The two series are defined by the following first-order difference
equations:

(18)   x_t = A x_{t-1} + \epsilon_t

(19)   y_t = B y_{t-1} + \eta_t

where |A| < 1, |B| < 1.



3 P. A. P. Moran: “Some theorems on time series,” Biometrika, vol. 34
(1947), pp. 281 ff.





The autocovariance between x_t and x_{t-s} is:

(20)   E x_t x_{t-s} = \frac{A^s \sigma^2}{1 - A^2} = c_s    (s = 0, 1, 2, \ldots)

The autocovariances of y_t are:

(21)   E y_t y_{t-s} = \frac{B^s \tau^2}{1 - B^2} = d_s    (s = 0, 1, 2, \ldots)


Hence we obtain from the formula given above [section 10.1.3, formula
(9)] the following expression for the large sample variance of the
covariance C of x_t and y_t:

(22)   V = \frac{c_0 d_0 + 2 \sum_{s=1}^{\infty} c_s d_s}{N} = \frac{\sigma^2 \tau^2 (1 + AB)}{N(1 - A^2)(1 - B^2)(1 - AB)}


It is apparent that for A = B = 0 this becomes the variance of the 
covariance of two unrelated series which are also not autocorrelated. 4 

For large samples, the covariance C will be normally distributed with 
the variance V given above [formula (22)]. 
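The closed form in (22) follows from summing the geometric series in c_s d_s, and the agreement can be checked numerically. The sketch below compares the closed form with a truncated version of the infinite sum; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
def moran_variance_closed_form(A, B, sigma2, tau2, N):
    """Right-hand side of (22)."""
    return sigma2 * tau2 * (1 + A * B) / (N * (1 - A ** 2) * (1 - B ** 2) * (1 - A * B))


def moran_variance_series(A, B, sigma2, tau2, N, terms=500):
    """Left-hand side of (22), with the infinite sum truncated at `terms`."""
    def c(s):
        return A ** s * sigma2 / (1 - A ** 2)   # autocovariance (20)

    def d(s):
        return B ** s * tau2 / (1 - B ** 2)     # autocovariance (21)

    total = c(0) * d(0) + 2 * sum(c(s) * d(s) for s in range(1, terms + 1))
    return total / N


closed = moran_variance_closed_form(0.5, -0.3, 2.0, 3.0, 40)
series = moran_variance_series(0.5, -0.3, 2.0, 3.0, 40)
uncorrelated = moran_variance_closed_form(0.0, 0.0, 2.0, 3.0, 40)  # reduces to sigma2*tau2/N
print(closed, series, uncorrelated)
```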

In practice, we will proceed as follows: We will estimate A from the
empirical series x_t by the first autocorrelation coefficient: A = r_1, or by
the method of least squares. The variance of the errors \epsilon_t can be
computed from:

N 



Alternatively we might estimate the variance by formula (7). 

An analogous method can be applied for the estimation of B and r 2 . 
These estimation methods are consistent and may work reasonably well 
with large samples. We may, however, doubt their efficiency for small 
samples. 5 In economics we have only seldom series of more than 
moderate length. 

Example 1. We use the yearly series of corn prices, 1926-40, in order
to illustrate a first-order stochastic difference equation. We have N = 14
yearly observations.

Equation (1) becomes:

(24)   X_{t+1} = 0.369 X_t + 41.5

This estimate is computed by the method of least squares. Hence 


4 C. E. Weatherburn: A First Course in Mathematical Statistics (Cambridge, 
1947), pp. 141 ff. 

5 L. Hurwicz: op. cit. 





a = 0.369. We estimate the variance \sigma^2 from (7) (the variance of X_t is
5637.53):

(25)   \sigma^2 = 4869.918

In Table 1 we give the higher autocorrelation coefficients computed
under the assumption that our series follows indeed the first-order
stochastic difference equation (24). This table shows the heavy damping
of the autocorrelation coefficients. It should be recalled, however, that
the estimate of the constant a is not very accurate because of the shortness
of our series.

TABLE 1

 Lag L     Autocorrelation Coefficient r_L

   1                  0.369
   2                  0.136
   3                  0.050
   4                  0.019
   5                  0.007
   6                  0.003
   7                  0.001
   8                  0.000
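The entries of this table are the powers a^L of the estimated coefficient, and the variance of the random element follows from the limiting variance formula (7) solved for \sigma^2. A minimal sketch (function names hypothetical) that reproduces both from the figures quoted in the example:

```python
def implied_correlogram(a, max_lag):
    """Limiting autocorrelations r_L = a**L for lags 1..max_lag, to 3 decimals."""
    return [round(a ** lag, 3) for lag in range(1, max_lag + 1)]


def innovation_variance(a, series_variance):
    """Formula (7) solved for sigma^2: variance of X times (1 - a^2)."""
    return series_variance * (1.0 - a ** 2)


corn_prices = implied_correlogram(0.369, 8)
sigma2 = innovation_variance(0.369, 5637.53)
print(corn_prices, round(sigma2, 3))
```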


Example 2. We consider next the 14 yearly observations of the stocks
of corn in the United States, 1926-40. Here we have for an estimate of
equation (1):

(26)   X_{t+1} = 0.734 X_t + 57.7

This is again a least squares estimate.

The variance of the original series is 235,476.80. Hence from formula
(7) the estimated variance of the random element is:

(27)   \sigma^2 = 108,612.261

Table 2 shows some of the higher autocorrelation coefficients, assuming 

TABLE 2

 Lag L     Autocorrelation Coefficient r_L

   1                  0.734
   2                  0.539
   3                  0.395
   4                  0.290
   5                  0.213
   6                  0.156
   7                  0.115
   8                  0.084
   9                  0.062
  10                  0.045





that our series follows the first-order difference equation (26). It should 
be remembered that the estimates given in the table are strictly valid only 
in large samples and that our sample is small. 

Example 3. We use the series of American corn prices and the series 
of stocks of corn, 1926-40, in order to illustrate Moran’s test. We have 
N = 13. The sample covariance between prices and stocks which we 
want to test is C = 2516.038. 

Let x be the series of prices and y the series of stocks. Then we have
in our present notation A = 0.369, B = 0.734. Also, our estimates of
the variances are \sigma^2 = 4869.918 and \tau^2 = 108,612.261.

We have from these data from formula (22) for V, the variance of the
covariance of prices and stocks (C):

        V = \frac{(4869.918)(108,612.261)[1 + (0.369)(0.734)]}{(13)[1 - (0.369)^2][1 - (0.734)^2][1 - (0.369)(0.734)]} = 177,978


In the limit for large samples C is normally distributed with mean zero 
and the given variance. The ratio of the empirical C to its standard 
error (square root of V) is 5.96. Hence it is certainly highly significant. 

We conclude that, assuming first-order stochastic difference equations, 
we must consider the prices of corn and the stocks of corn as variables 
which are probably correlated in the population which corresponds to 
our sample. 

This result corresponds to our expectations from general economic 
considerations. It should be remembered, however, that our estimates 
of the autoregression coefficients and of the variances of the random 
elements are valid only in large samples. The covariance of the prices 
and stocks is only normally distributed if the sample is large, which is 
not the case with our data. Hence the reliability of our test is not very
great.

10.3.2 SECOND-ORDER DIFFERENCE EQUATIONS 
We consider the second-order linear difference equation: 6 

(1)    x_{t+2} + a x_{t+1} + b x_t = \epsilon_{t+2}


6 M. G. Kendall: “Oscillatory movements in British agriculture,” Journal of
the Royal Statistical Society, vol. 106 (1944), pp. 91 ff.; “On autoregressive
time series,” Biometrika, vol. 33 (1944), pp. 105 ff.; “Contributions to the study
of oscillatory time series,” National Institute of Economic and Social Research,
Occasional Paper 9 (Cambridge, 1946); The Advanced Theory of Statistics,
vol. 2 (London, 1946), pp. 414 ff. See also G. U. Yule: “On a method of
investigating periodicities in disturbed series, with special application to
Wolfer's sun spot numbers,” Philosophical Transactions of the Royal Society,




where $e_t$ is a non-autocorrelated random variable with mean zero and
variance $\sigma^2$; $a$ and $b$ are constants. The solution of (1) is:

(2)  $x_t = \sum_{s=0}^{t-1} u_s e_{t-s+1}$

if $x_0 = x_1 = 0$.


We have to distinguish two cases according to the sign of $a^2 - 4b$:

(a) If $a^2 - 4b > 0$, then the solution is:

(3)  $u_s = \frac{P^s - Q^s}{P - Q}$

where:

(4)  $P = \frac{-a + \sqrt{a^2 - 4b}}{2}$

(5)  $Q = \frac{-a - \sqrt{a^2 - 4b}}{2}$

The variance of $x_t$ is:

(6)  $E x_t^2 = \frac{(1 + PQ)\sigma^2}{(1 - P^2)(1 - Q^2)(1 - PQ)}$

This will be positive and finite if $P^2 < 1$, $Q^2 < 1$, $PQ = b < 1$.
The autocorrelation coefficients of the generated process $x_t$ are:

(7)  $r_L = \frac{P(1 - Q^2)P^L}{(P - Q)(1 + PQ)} + \frac{Q(1 - P^2)Q^L}{(Q - P)(1 + PQ)}$

This is the sum of two decreasing exponentials.

(b) Now assume that $a^2 - 4b < 0$. We have:

(8)  $u_s = \frac{2}{\sqrt{4b - a^2}} p^s \sin vs$

(9)  $p = \sqrt{b}$

(10)  $\cos v = \frac{-a}{2\sqrt{b}}$


... Econometrica, vol. 17 (1949), supplement, pp. 44 ff. H. Wold: A Study
in the Analysis of Stationary Time Series (Uppsala, 1938), pp. 110 ff.





We define:

(11)  $\tan w = \frac{1 + b}{1 - b} \tan v$

The variance of the generated series $x_t$ is:

(12)  $E x_t^2 = \frac{(1 + b)\sigma^2}{(1 - b)\{(1 + b)^2 - a^2\}}$

The autocorrelation coefficients are:

(13)  $r_L = \frac{p^L \sin (Lv + w)}{\sin w}$


Hence the correlogram of the stochastic process (1) is in this case a 
damped harmonic. 

Because of the result of Bartlett 7 indicated above, the correlogram of 
an empirical series may fail to indicate this feature. 

The mean distance between peaks has been given by M. G. Kendall 
in the brochure cited above. 8 The use of this particular statistic does 
not seem to be a very promising method in this field. 

A difference equation of form (1) without the random element e t may 
give rise to sinusoidal fluctuations. Hence the stochastic process (1),
which is rather simple, has been used very frequently in econometric 
research. Difference equations of higher order will be discussed in 
section 10.3.3. 

Example 1. As an example of a second-order difference equation we
consider the relationship derived from the series of American meat
consumption, 1919-41. The data are yearly deviations from a cubic trend
(section 8.1, Example 2).

The estimated relationship is:

(14)  $x_{t+2} - 0.121918 x_{t+1} + 0.362210 x_t = e_{t+2}$

This relationship has been estimated by correlogram methods (section
10.6.2, Example 2). Hence we have $a = -0.121918$, $b = 0.362210$.
We have also, from formulae (9) and (10) given above, $p = 0.601830$,
$\cos v = 0.100957$ and hence $v = 84^\circ 12'$. The general formula for $u_s$
(8) is:

(15)  $u_s = (1.670150)(0.601830)^s \sin [(84^\circ 12') \cdot s]$


7 M. S. Bartlett: op. cit.

8 M. G. Kendall: “Contributions to the study of oscillatory time series,”
National Institute of Economic and Social Research, Occasional Paper 9
(Cambridge, 1946), pp. 55 ff.


We give in Table 1 the values of $u_s$ for $s = 1$ to $s = 5$.

TABLE 1

s      $u_s$
1       1.000
2       0.609
3     - 0.111
4     - 0.108
5       0.066

The variance of the series $x_t$ is given as:

(16)  $E x_t^2 = 1.160\sigma^2 = 23.833$

Hence $\sigma^2 = 20.546$. We have also $\tan w = 21.026799$ and hence
$w = 87^\circ 17'$ [formula (11)].

The general formula for the autocorrelation coefficients is:

(17)  $r_L = (1.001121)(0.601830)^L \sin [(84^\circ 12') \cdot L + 87^\circ 17']$

We present in Table 2 the autocorrelation coefficients for lags 1 to 5,
assuming the validity of the stochastic scheme (14).

TABLE 2

Lag L    Autocorrelation Coefficient $r_L$
1           0.089
2         - 0.361
3         - 0.039
4           0.121
5           0.043


The reliability of these results should not be exaggerated. It is by no
means certain that the deviation of meat consumption from a cubic trend
follows a stochastic difference equation of the second order and not a
more complicated stochastic scheme, for instance, a difference equation
of a higher order. The problem of the appropriate order of the difference
equation will be considered in section 10.3.5. The application of the
methods gives consistent results. Perhaps these are, however, not too
reliable because of the shortness of our series. The existence of errors
of observations has also been neglected.
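The damped-harmonic formulas (9) to (13) can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates formula (13) for the coefficients of relationship (14) and compares the result with the recursion $r_L = -a r_{L-1} - b r_{L-2}$ (with $r_0 = 1$, $r_1 = -a/(1 + b)$), a standard property of second-order schemes that is assumed here rather than quoted from the text. The function names are hypothetical.

```python
import math

def ar2_autocorrelations(a, b, max_lag):
    """Formula (13) for x[t+2] + a*x[t+1] + b*x[t] = e[t+2], case a^2 < 4b."""
    if a * a - 4.0 * b >= 0.0:
        raise ValueError("formula (13) requires a^2 - 4b < 0")
    p = math.sqrt(b)                              # formula (9)
    v = math.acos(-a / (2.0 * math.sqrt(b)))      # formula (10), 0 < v < pi
    # Formula (11); atan2 keeps w between 0 and pi even when v exceeds 90 degrees.
    w = math.atan2((1.0 + b) * math.sin(v), (1.0 - b) * math.cos(v))
    return [p ** lag * math.sin(lag * v + w) / math.sin(w)
            for lag in range(max_lag + 1)]

def ar2_autocorrelations_recursive(a, b, max_lag):
    """Same coefficients from r_L = -a*r_{L-1} - b*r_{L-2} (assumed recursion)."""
    r = [1.0, -a / (1.0 + b)]
    for _ in range(2, max_lag + 1):
        r.append(-a * r[-1] - b * r[-2])
    return r[: max_lag + 1]

def ar2_variance_factor(a, b):
    """Ratio E[x_t^2] / sigma^2 from formula (12)."""
    return (1.0 + b) / ((1.0 - b) * ((1.0 + b) ** 2 - a ** 2))

# Coefficients of relationship (14).
closed = ar2_autocorrelations(-0.121918, 0.362210, 5)
recursive = ar2_autocorrelations_recursive(-0.121918, 0.362210, 5)
factor = ar2_variance_factor(-0.121918, 0.362210)
```

Under these assumptions the closed form and the recursion agree to rounding error, the first coefficient comes out near 0.0895, and the variance factor rounds to the 1.160 of formula (16).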

Example 2. We give here a stochastic difference equation of the second
order which is satisfied by the series of American wholesale prices. The
data are deviations from a moving average (see section 8.2.1, Example 1).
The estimates result from a correlogram analysis (see section 10.6.2,
Example 3).





The estimated relationship is as follows: 

(18)  $x_{t+2} + 0.052785 x_{t+1} + 0.105744 x_t = e_{t+2}$

We have from formula (9) the estimate of $p = 0.325576$; from (10):

(19)  $\cos v = -0.081064$ and $v = 94^\circ 39'$

Further we derive from (8) the general formula for $u_s$:

(20)  $u_s = 3.086067(0.325576)^s \cdot \sin (94^\circ 39' \cdot s)$

The various quantities $u_s$ are shown in Table 3.

TABLE 3

s      $u_s$
1       1.000
2     - 0.053
3     - 0.103
4       0.011
5       0.010


We have for the estimated variance of the series $x_t$ from formula (12):

(21)  $E x_t^2 = \frac{(1 + 0.105744)\sigma^2}{(1 - 0.105744)\{(1.105744)^2 - (0.052785)^2\}} = 1.014361\sigma^2 = 37.733$

Hence $\sigma^2 = 37.199$.

We have also $\tan w = -15.202718$ and $w = 93^\circ 46'$. The general
autocorrelation formula is:

(22)  $r_L = \frac{(0.325576)^L \sin [(94^\circ 39')L + 93^\circ 46']}{0.99784}$


We give in Table 4 the autocorrelation coefficients for lags 1 to 5; 
they are computed under the assumption of the validity of relationship 
(18): 

TABLE 4

Lag L    Autocorrelation Coefficient $r_L$
1         - 0.046
2         - 0.103
3           0.011
4           0.010
5         - 0.002


The very rapid damping with increasing lag is apparent. 

The reliability of our results is perhaps again not too great because of 
the reasons mentioned in connection with Example 1. 
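The weights $u_s$ need not be taken from the trigonometric formula (8); they can also be generated step by step. The sketch below is illustrative only and assumes the starting values $u_0 = 0$, $u_1 = 1$ together with the homogeneous recursion $u_s = -a u_{s-1} - b u_{s-2}$, which formula (8) satisfies. The function name is hypothetical; the coefficients are those of relationship (18).

```python
def impulse_weights(a, b, count):
    """Weights u_1, ..., u_count of x[t+2] + a*x[t+1] + b*x[t] = e[t+2].

    Assumes u_0 = 0, u_1 = 1 and u_s = -a*u_{s-1} - b*u_{s-2}.
    """
    u = [0.0, 1.0]
    for _ in range(2, count + 1):
        u.append(-a * u[-1] - b * u[-2])
    return u[1 : count + 1]

# Coefficients of relationship (18).
weights = impulse_weights(0.052785, 0.105744, 5)
rounded = [round(w, 3) for w in weights]
```

Rounded to three decimals, the five weights obtained in this way are 1.000, -0.053, -0.103, 0.011 and 0.010, the values listed in Table 3.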




10.3.3 ONE SINGLE DIFFERENCE EQUATION OF ARBITRARY 
ORDER 9 


We assume a stochastic difference equation of order $p$:

(1)  $X_t = \sum_{s=1}^{p} a_s X_{t-s} + a_0 + e_t \qquad (t = 1, 2, \cdots N)$

The random variables $e_t$ are not autocorrelated. Their mean is zero and
their variance is $\sigma^2$. Equation (1) is also called a general scheme of linear
autoregression.

Let us assume that all the roots of the characteristic equation

(2)  $y^p - \sum_{s=1}^{p} a_s y^{p-s} = 0$

are less than 1 in absolute value. Then the process is stationary. There 
is no evolutionary component, no secular trend. All probabilities are 
invariant against a translation of the time axis. 

The method of maximum likelihood leads to the following estimates: 


(3)  $\sum_{t=1}^{N} X_t X_{t-j} - \sum_{i=1}^{p} a_i \sum_{t=1}^{N} X_{t-i} X_{t-j} - a_0 \sum_{t=1}^{N} X_{t-j} = 0 \qquad (j = 1, 2, \cdots p)$

(4)  $\sum_{t=1}^{N} X_t - \sum_{i=1}^{p} a_i \sum_{t=1}^{N} X_{t-i} - N a_0 = 0$

These are the normal equations for the estimation of the parameters
$a_0, a_1, \cdots a_p$. The maximum likelihood estimates are the least squares
estimates of equation (1).

The variance $\sigma^2$ is estimated by:

(5)  $s^2 = \frac{1}{N} \sum_{t=1}^{N} \left( X_t - \sum_{i=1}^{p} a_i X_{t-i} - a_0 \right)^2$

These are consistent estimates. The estimates converge in probability
to the true population values as $N$ becomes infinite (section 5.2). We may,
however, doubt their efficiency, especially for the short series which are
frequently encountered in econometric research. They may also be
biased. 10
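The least squares fit of a linear autoregressive scheme can be sketched in a few lines. The example below is illustrative and rests on several assumptions that are not in the text: the series is simulated from a second-order scheme with made-up coefficients, the sums run over the observations for which all lagged values are available, and the helper names `solve` and `fit_autoregression` are hypothetical. The residual variance is divided by the number of fitted observations, in the spirit of formula (5).

```python
import random

def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def fit_autoregression(x, p):
    """Return ([a_0, a_1, ..., a_p], s2) for X_t = a_0 + sum_s a_s X_{t-s} + e_t."""
    rows = [[1.0] + [x[t - s] for s in range(1, p + 1)] for t in range(p, len(x))]
    target = x[p:]
    k = p + 1
    # Normal equations: sums of squares and cross products of the lagged values.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    coef = solve(xtx, xty)
    residuals = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, target)]
    s2 = sum(e * e for e in residuals) / len(residuals)
    return coef, s2

# Simulated series (hypothetical coefficients a_0 = 0.5, a_1 = 0.4, a_2 = -0.3).
random.seed(0)
true_coef = [0.5, 0.4, -0.3]
x = [0.0, 0.0]
for _ in range(4000):
    x.append(true_coef[0] + true_coef[1] * x[-1] + true_coef[2] * x[-2]
             + random.gauss(0.0, 1.0))
coef, s2 = fit_autoregression(x, 2)
```

With a long simulated series the estimates lie close to the coefficients used in the simulation, which illustrates consistency; nothing in such an experiment speaks to the behavior in short series, where the doubts expressed above remain.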


9 H. B. Mann and A. Wald: “On the statistical treatment of linear stochastic
difference equations,” Econometrica, vol. 11 (1943), pp. 173 ff. J. R. N. Stone:
“Prediction from autoregressive schemes and linear stochastic difference
equations,” ibid., vol. 17 (1949), supplement, pp. 29 ff.

10 L. Hurwicz: op. cit.





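We denote:

(6)  $D_{ijN} = \frac{\sum_{t=1}^{N} X_{t-i} X_{t-j}}{N} \qquad (i, j = 1, 2, \cdots p)$

(7)  $D_{0jN} = D_{j0N} = \frac{\sum_{t=1}^{N} X_{t-j}}{N} \qquad (j = 1, 2, \cdots p)$

(8)  $D_{00N} = 1$

The inverse of the matrix $[D_{ijN}]$ ($i, j = 0, 1, \cdots p$) is $[c_{ijN}]$ (Appendix
A.1.1). In the limit, i.e., for $N \to \infty$, the quantities $\sqrt{N}(a_i - a_i')$ and
$\sqrt{N}(a_j - a_j')$ are normally distributed with covariance $s^2 \cdot c_{ijN}$.
Hence the quantity:

(9)  $\frac{\sqrt{N}(a_i - a_i')}{s\sqrt{c_{iiN}}}$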

is approximately normally distributed with zero mean and unit variance. 
The quantity (9) may be used to establish fiducial or confidence limits, 
test hypotheses, etc. These procedures have of course only asymptotic 
validity. They are not very reliable for small samples. 

For purposes of prediction we have: 

(10)  $X'_{t+1} = a_0 + \sum_{i=1}^{p} a_i X_{t-i+1}$

This is a point estimate for $X_{t+1}$. But

(11)  $X'_{t+1} - X_{t+1}$

is normally distributed with zero mean and estimated variance $s^2$ if $N$
is large, and hence we may neglect the variances of the $a_i$.

The quantity (11) may be used in order to find fiducial or confidence
limits for the predicted value $X'_{t+1}$. All these procedures have only
limited validity for small samples.

J. Tinbergen starts his important investigations of business cycles in 
the United States 11 with a system of difference equations. He reduces 
this system, however, to a single difference equation of type (1) without 
the error term e t . It is not advisable to proceed in this fashion with 
systems of stochastic difference equations since very complicated error 
terms with autocorrelations and serial correlations of unknown magnitudes 
may arise by such a procedure. This will be the case even if the random 

11 J. Tinbergen: Business Cycles in the United States of America, 1919-1932 
(Geneva, 1939). 




elements in the original difference equations are not autocorrelated and 
not serially correlated. 

Example 1. Let us use the yearly series of American meat consumption,
1919-41. The data are deviations from a cubic trend (section 8.1,
Example 2). We want to fit by the method of least squares a third-order
linear difference equation (1):

(12)  $X_t = a_1 X_{t-1} + a_2 X_{t-2} + a_3 X_{t-3} + a_0$

We use $N = 23$ items. Then the estimate of the above relationship
(12) is given by the classical method of least squares:

(13)  $X_t = 0.219 X_{t-1} + 0.281 X_{t-2} + 0.231 X_{t-3} + 0.430$

The estimated variance of the random element $e_t$ is $s^2 = 18.271$ from
formula (5). The standard errors of the regression coefficients are 0.229,
0.198, and 0.221. The ratios of the regression coefficients to the standard
errors are 0.956, 1.419, and 1.045. But with a normal distribution we
want at the 5 per cent level a ratio at least as large as 1.96. Hence none
of the regression coefficients appears as statistically significant.

Equation (13) may be used for forecasting. Suppose that we want a 
forecast of X t for 1942. The value for 1941 was 0.79; for 1940, 3.53; 
for 1939, — 0.50. Hence we have for our forecast from formula (10): 

(14)  $X'_{t+1} = (0.219)(0.79) + (0.281)(3.53) + (0.231)(-0.50) + 0.430 = 1.479$

This estimate is in the limit normally distributed with mean $X_{t+1}$ and
variance $s^2 = 18.271$. Hence we may compute the 95 per cent confidence
or fiducial limits: -6.906 and 9.864.
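The arithmetic of this example is easy to retrace. The short sketch below uses only the figures quoted in the text (the coefficients of equation (13), the three most recent observations, the standard errors and $s^2$); the variable names are illustrative, and the factor 1.96 is the 5 per cent point of the normal distribution mentioned above.

```python
import math

coefficients = [0.219, 0.281, 0.231]      # a_1, a_2, a_3 from equation (13)
intercept = 0.430                         # a_0
s2 = 18.271                               # estimated variance of the random element
recent = [0.79, 3.53, -0.50]              # observations for 1941, 1940, 1939
standard_errors = [0.229, 0.198, 0.221]

# Point forecast for 1942: the intercept plus the weighted lagged values.
forecast = intercept + sum(a * x for a, x in zip(coefficients, recent))

# Large-sample 95 per cent limits, neglecting the variances of the coefficients.
half_width = 1.96 * math.sqrt(s2)
lower, upper = forecast - half_width, forecast + half_width

# Ratios of the regression coefficients to their standard errors.
ratios = [a / se for a, se in zip(coefficients, standard_errors)]
```

The forecast rounds to 1.479 and the ratios to 0.956, 1.419 and 1.045. The limits computed in this way agree with the figures -6.906 and 9.864 to within about 0.01; the small difference presumably reflects rounding in the original computation.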

The negative outcome of the tests of significance and the large limits
for the predicted value indicate that it is not really very likely that the
American meat consumption (deviations from a cubic trend) follows a
stochastic difference equation of the third order. We have also neglected
the existence of errors of observation, which are without any doubt
present in the series analyzed.

10.3.4 SYSTEMS OF STOCHASTIC DIFFERENCE EQUATIONS 12 
We consider now a system of difference equations:

(1)  $\sum_{j=1}^{r} \sum_{k=0}^{p_{ij}} a_{ijk} X_{jt-k} + a_i = e_{it} \qquad (i = 1, 2, \cdots r)$

12 H. B. Mann and A. Wald: op. cit. H. E. Jones: “The nature of regression
functions in the correlation analysis of time series,” Econometrica, vol. 5 (1937).
I. Fisher: “Our unstable dollar and the so-called business cycle,”





$p_{ij}$ is the largest lag of the variable $X_{jt}$ in equation (1). Methods for
estimating this lag will be given in section 10.3.5. The system is stationary.

The $e_{it}$ are random variables which are independent, not autocorrelated
and not serially correlated. They have zero means and covariances.

Assume that the matrix $[a_{ij0}]$ is the unit matrix (Appendix A.1.1). In
each equation there is only one variable which enters without a lag. This
is to say, $a_{ij0} = 0$, for $i \neq j$. Other methods for dealing with systems
of stochastic difference equations have been given in Chapter 7.

In this case we simply have to minimize:

(2)  $\sum_{t=1}^{N} e_{it}^2 \qquad (i = 1, 2, \cdots r)$

(3)  $Q_i = \sum_{t=1}^{N} \left( \sum_{j=1}^{r} \sum_{k=0}^{p_{ij}} a_{ijk} X_{jt-k} + a_i \right)^2$

This idea is related to the method described in section 10.3.3. This
yields the normal equations:

(4)  $\sum_{t=1}^{N} X_{ut-s} \left( \sum_{j=1}^{r} \sum_{k=0}^{p_{ij}} a_{ijk} X_{jt-k} + a_i \right) = 0 \qquad (i = 1, 2, \cdots r;\ u = 1, 2, \cdots r;\ s = 1, 2, \cdots p_{iu})$

(5)  $\sum_{t=1}^{N} \left( \sum_{j=1}^{r} \sum_{k=0}^{p_{ij}} a_{ijk} X_{jt-k} \right) + N a_i = 0$

The maximum likelihood estimates are the least squares estimates.
The estimate of the variance of $e_{it}$ is:

(6)  $s_i^2 = \sum_{t=1}^{N} \left( \sum_{j=1}^{r} \sum_{k=0}^{p_{ij}} a_{ijk} X_{jt-k} + a_i \right)^2 / N \qquad (i = 1, 2, \cdots r)$


The derived relationships are not necessarily meaningful in the economic 
sense if the problem of identification has not been solved (section 6.5 
and Chapter 7). 
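When only one variable enters each equation without a lag, the system can be fitted one equation at a time by ordinary least squares. The sketch below illustrates this for a hypothetical two-variable system with lags of one period; the data are simulated, the coefficients are invented for the illustration, and the lagged terms are written on the right-hand side (so their signs are opposite to the $a_{ijk}$ of equation (1)). The helper names `least_squares` and `fit_system` are not from the text.

```python
import random

def least_squares(rows, target):
    """Solve the normal equations (X'X) c = X'y by Gaussian elimination."""
    k = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, target))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    c = [0.0] * k
    for i in range(k - 1, -1, -1):
        c[i] = (m[i][k] - sum(m[i][j] * c[j] for j in range(i + 1, k))) / m[i][i]
    return c

def fit_system(series):
    """Fit X_i[t] = const_i + sum_j coef_ij * X_j[t-1] + e_i[t], equation by equation."""
    n = len(series[0])
    rows = [[1.0] + [s[t - 1] for s in series] for t in range(1, n)]
    return [least_squares(rows, s[1:]) for s in series]

# Simulated stationary system; each row is [constant, coef on X_1[t-1], coef on X_2[t-1]].
random.seed(1)
true = [[1.0, 0.5, 0.2], [-0.5, -0.3, 0.4]]
x1, x2 = [0.0], [0.0]
for _ in range(4000):
    prev1, prev2 = x1[-1], x2[-1]
    x1.append(true[0][0] + true[0][1] * prev1 + true[0][2] * prev2 + random.gauss(0.0, 1.0))
    x2.append(true[1][0] + true[1][1] * prev1 + true[1][2] * prev2 + random.gauss(0.0, 1.0))
estimates = fit_system([x1, x2])
```

Every equation shares the same lagged regressors, so the moment matrix could be formed once and reused; it is rebuilt per equation here only to keep the sketch short.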

Example 1. We want to illustrate the statistical treatment of systems
of stochastic difference equations by yearly data referring to American
corn production, 1926-40. We denote by $X_{1t}$ the demand for corn, by
$X_{2t}$ the price of corn, by $X_{3t}$ the total supply of corn, and by $X_{4t}$ the
stocks of corn.


Journal of the American Statistical Association, vol. 21 (1925), pp. 179 ff. C. F.
Roos: Dynamic Economics (Bloomington, Ind., 1934), pp. 263 ff. H. T. Davis:
Analysis of Economic Time Series (Bloomington, Ind., 1941), pp. 104 ff. M. M.
Flood: “Recursive methods in business cycle analysis,” Econometrica, vol. 8
(1940), pp. 333 ff. T. C. Koopmans: “Distributed lags in dynamic economics,”
ibid., vol. 9 (1941), pp. 128 ff.





We assume the system in the following form:

(7)  $X_{1t} = a_{121} X_{2t-1} + a_{141} X_{4t-1} + a_1$

(8)  $X_{2t} = a_{221} X_{2t-1} + a_{241} X_{4t-1} + a_2$

(9)  $X_{3t} = a_{321} X_{2t-1} + a_{341} X_{4t-1} + a_3$

(10)  $X_{4t} = a_{421} X_{2t-1} + a_{441} X_{4t-1} + a_4$

The matrix $[a_{ij0}]$ is evidently the unit matrix. Hence we can immediately
apply the classical method of least squares. This yields the following
estimates:

(11)  $X_{1t} = -2.630037 X_{2t-1} + 0.231732 X_{4t-1} + 2465.142$

(12)  $X_{2t} = 0.391680 X_{2t-1} + 0.020103 X_{4t-1} + 88.188$

(13)  $X_{3t} = -5.198178 X_{2t-1} + 0.836231 X_{4t-1} + 2812.450$

(14)  $X_{4t} = -1.933443 X_{2t-1} + 0.202037 X_{4t-1} + 323.079$

This system can evidently be considered only as purely descriptive, and
the individual equations cannot be identified with meaningful economic
relationships. For instance, if (7) was the demand function for corn
then we would expect at least the variable $X_{2t}$ (contemporaneous price of
corn) to appear in the equation.

A method which is based upon the idea of process analysis, where the
individual equations have definite economic meaning, will be presented
in section 10.3.7.


10.3.5 LAG CORRELATION (SERIAL CORRELATION) 

Systems of stochastic difference equations or schemes of linear
autoregression of the form of equation (1) of section 10.3.4 involve lag
correlations and lagged regression. We have assumed up to now that the
lags are known. The situation becomes much more complicated if this
is not the case and the lags have also to be estimated. (This is in our
notation equivalent to an estimation of the constants $p_{ij}$.) In economic
data we will in general expect distributed lags. 13 Some methods of
dealing statistically with distributed lags are given by Alt. 14

The idea of distributed lags is very important for economic statistics.
There are many theoretical economic models which deal with the following
phenomenon: There is a functional connection between the changes in
one economic variable and subsequent changes in others. These ideas


13 C. F. Roos: op. cit., pp. 69 ff.

14 F. Alt: “Distributed lags,” Econometrica, vol. 10 (1942), pp. 113 ff.





involve difference equations, 15 differential equations, 16 integral equations, 17 
mixed difference and differential equations, 18 etc. 

The statistical treatment of these phenomena is very difficult. There 
is some hope that we may eventually be able to use the ideas which have 
been recently developed in connection with stochastic processes. 19 Some 
of these ideas will be indicated below in section 10.4. 

Example 1. In his brilliant study, Factors Influencing Residential
Building, Roos 20 introduces distributed lags. He uses data for the period
1890-1933 and employs the following notation.

Let $B(t)$ be the quantity of new building, $E(t)$ the incentives for building.
$E(t)$ is approximated by the formula:

(1)  $E = \frac{(R \cdot p - T)^{0.86} \cdot W}{C^{0.86}}$

where $R$ is rent in dollars, $p$ fraction of houses occupied, $C$ replacement
cost in dollars, and $T$ is taxes in dollars.

The function $W$ is:

(2)  $W = 0.1 + (0.9)(10)^{-0.00045 f}$

where $f$ is the number of foreclosures per year per 100,000 families.
The final formula is: 


(3) B(i) = 0.443 + j' [E(x)e-^- x - 2 ^dx 

t-t 0 


11 909 1 


N< 


t-t, 



15 P. A. Samuelson: “Dynamic process analysis," in H. S. Ellis, ed.: A 
Survey of Contemporary Economics (Philadelphia, 1948), pp. 352 ff. 

16 G. C. Evans: Mathematical Introduction to Economics (New York, 1930), 
pp. 106 ff. G. Tintner: “A ‘simple’ theory of business fluctuations," Econo - 
metrica , vol. 10 (1942), pp. 317 ff. 

17 G. C. Evans: op. cit ., pp. 143 ff. C. F. Roos: op. c/7., pp. 14 ff. H. T. 
Davis: Theory of Econometrics (Bloomington, Ind., 1941), pp. 377 ff. 

18 M. Kalecki: Essays in the Theory of Economic Fluctuations (London, 
1935); Studies in Economic Dynamics (New York, 1944); “A macrodynamic 
theory of business cycles," Econometrica , vol. 3 (1935), pp. 327 ff. R. Frisch 
and H. Holme: “The characteristic solutions of a mixed difference-differential 
equation," ibid., vol. 3 (1935), pp. 225 ff. R. W. James and M. H. Belz: “On 
a mixed difference and differential equation," ibid., vol. 4 (1936), pp. 157 ff. 

19 W. Feller: An Introduction to Probability Theory and Its AppHcations y 

vol. 1 (New York, 1950). 

20 C. F. Roos: op. c/7., pp. 69 ff. 




where $b$ is a constant and the constants $N_1$ and $N_2$ are defined by:

(4) N x = j' e -(t-x-2.25m dx 

t-to 

$ 

(5) N 2 = J’ e-V-z-i.7Wia.2s,o dx 

t-t. 

It is apparent that the distributed lags have a normal form which makes
them amenable to easy computation. All the constants have been
computed by Roos by the classical method of least squares. It is somewhat
doubtful whether this method is really applicable in this case.

Example 2. Alt 21 correlates monthly seasonally adjusted figures of
copper deliveries ($Y_t$) with monthly totals of new orders ($X_t$) for the
period 1920-38. He obtains the following lag regressions:

(6)  $Y_t = 0.014 + 0.991 X_t$

(7)  $Y_t = 0.005 + 0.445 X_t + 0.560 X_{t-1}$

(8)  $Y_t = -0.002 + 0.487 X_t + 0.175 X_{t-1} + 0.353 X_{t-2}$

(9)  $Y_t = -0.003 + 0.490 X_t + 0.180 X_{t-1} + 0.280 X_{t-2} + 0.067 X_{t-3}$

(10)  $Y_t = -0.005 + 0.499 X_t + 0.174 X_{t-1} + 0.288 X_{t-2} - 0.032 X_{t-3} + 0.091 X_{t-4}$

The first equation (6) represents the regression of copper deliveries on
contemporary orders; the second (7) is the regression of copper deliveries
on contemporary orders and orders lagged 1 month. The third equation (8)
relates copper deliveries to contemporary orders and to orders lagged 1
and 2 months. Equation (9) is the estimated relation of copper deliveries
to contemporary orders and orders lagged 1, 2, and 3 months. The last
equation (10) represents the estimated relationship between the copper
deliveries and orders which are contemporaneous, lagged 1 month, 2
months, 3 months, and 4 months.

Alt is aware of the danger of close relationship between the various
lagged and unlagged order series (see section 6.5). He concludes that
the fourth equation (10) is not valid, since negative regression coefficients
appear. This is nonsensical from an economic point of view. Hence
he believes that equation (9), which has a maximum lag of 3 months,
represents probably best the “true” relationship between deliveries and
orders in the population from which the data are a sample.

The search for the appropriate maximum lag is most important. But
the procedures are not very satisfactory, especially since they are not
based upon valid statistical considerations. We have again essentially

21 F. Alt: op. cit. 





a multiple choice problem (section 1.2). Perhaps it would be possible
to treat it in a fashion similar to that used for the problem of choosing
the appropriate degree of the polynomial which is to represent a trend
(section 8.1).

10.3.6 STOCHASTIC DIFFERENCE EQUATIONS WITH ERRORS IN 
THE VARIABLES 

We can formulate a still more general linear system of stochastic
difference equations. Let formula (1) of section 10.3.4 be a system of equations
which expresses the structural relationships existing in the economy. We
change, however, the assumptions regarding the random variables $e_{it}$.
Let the random variables $e_{it}$ be subject to a similar system of stochastic
difference equations:

(1)  $\sum_{j=1}^{r} \sum_{k=0}^{p'_{ij}} b_{ijk} e_{jt-k} = \zeta_{it}$

The $\zeta_{it}$ are random variables which are not autocorrelated and not serially
correlated. In general, the variables $e_{it}$ are now correlated, they are
also autocorrelated, and there are serial correlations between them.

We observe, however, not the “true” variables $X_{it}$ but variables which
are subject to certain errors of observations:

(2)  $Y_{it} = X_{it} + \eta_{it} \qquad (i = 1, 2, \cdots r;\ t = 1, 2, \cdots N)$

It is possible that even the errors of observation $\eta_{it}$ are subject to a
system of stochastic difference equations similar to (1):

(3)  $\sum_{j=1}^{r} \sum_{k=0}^{p''_{ij}} c_{ijk} \eta_{jt-k} = \rho_{it}$

The $\rho_{it}$ are not autocorrelated and not serially correlated. In general,
the errors of observation $\eta_{it}$ are correlated, they are also autocorrelated,
and there is serial correlation between them. Von Szeliski 22 has indicated
that there might be negative autocorrelations.

We make certain assumptions about the joint distribution of the $\zeta_{it}$
and the $\rho_{it}$. Given nothing but the observations (2), can we estimate
the constants $a_i$ and $a_{ijk}$? Can we make predictions regarding the $X_{it}$,
especially as far as their oscillatory nature is concerned?

Needless to say, there are as yet no answers to these questions. The 
traditional methods of least squares and maximum likelihood seem not 
to be able to help us with this general problem. It is quite possible that 
we may not be able to make estimates in this very general case, and we 


22 V. von Szeliski: “Analysis of random errors in time series,” Appendix II
in C. F. Roos: op. cit.




may have to supplement our hypotheses indicated above by some more 
a priori assumptions. 

An estimation method for a second-degree stochastic difference equation 
with superimposed errors of observations has recently been given by 
S. G. Ghurie. 23 

Another method is due to M. H. Quenouille. 24 Assume that we have
a second-order stochastic difference equation:

(4)  $x_{t+2} + a x_{t+1} + b x_t = e_{t+2}$

The random variable $e_t$ has mean zero and variance $\sigma^2$, and the series
is not autocorrelated.

We observe, however, not the variables $x_t$ ($t = 1, 2, \cdots N$) but the
variables:

(5)  $y_t = x_t + \eta_t$

The $\eta_t$ are superimposed errors of observations (section 6.5). Their
mean is zero, their variance $\tau^2$, and they are not autocorrelated. There
is also no correlation or serial correlation between $e_t$ and $\eta_t$.
We have evidently:

(6)  $E y_t^2 = E x_t^2 + \tau^2$


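But from formula (12), section 10.3.2, we have for the estimation of $\sigma^2$:

(7)  $E x_t^2 = \frac{(1 + b)\sigma^2}{(1 - b)[(1 + b)^2 - a^2]}$

We define:

(8)  $c = \frac{\tau^2}{E x_t^2}$

and have the following equations for the estimation of the constants
$a$, $b$, and $c$:

(9)  $a^2 \left( \frac{r_1^2 - r_2^2}{r_1 - r_3} \right) + 2ar_2 + r_3 = 0$

(10)  $b = -\frac{r_3 + a r_2}{r_1}$

(11)  $c = -\frac{a}{(1 + b) r_1} - 1$
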
23 S. G. Ghurie: “A method of estimating the parameters of an autoregressive
time series,” Biometrika, vol. 37 (1950), pp. 173 ff.

24 M. H. Quenouille: “A large sample test for the goodness of fit of
autoregressive schemes,” Journal of the Royal Statistical Society, vol. 110 (1947),
pp. 123 ff.





In these formulae $r_1$, $r_2$, and $r_3$ are the first three autocorrelation
coefficients of the observed series $y_t$. We have to choose the root of the
quadratic equation (9) which makes the estimate of $c$ in formula (11)
positive.

Example 1. We analyze the yearly series of meat consumption. The 
data are deviations from a cubic trend (section 8.1, Example 2). There 
are 23 yearly observations. 

The first three autocorrelation coefficients are $r_1 = 0.0895$, $r_2 =
-0.3513$, $r_3 = -0.1779$. We assume that the unknown “true” values
of the series follow a second-order stochastic difference equation [formula
(4)]. There is, however, also a superimposed error of observation
[formula (5)].

The quadratic equation (9) is in this case:

(12)  $-0.431567a^2 - 0.702600a - 0.177900 = 0$

The two roots are -1.314345 and -0.313676. The second root gives
a positive value to $c$ [formula (11)] and hence we take $a = -0.313676$.
We have also from formula (10) $b = 0.756487$, and from formula (11)
$c = 0.995318$. Hence the stochastic difference equation (4) becomes:

(13)  $x_{t+2} - 0.313676 x_{t+1} + 0.756487 x_t = e_{t+2}$

This estimate should be compared with the results in Example 2, 
section 10.6.2, where no errors of observation are assumed to exist. 

We have as an estimate of $E y_t^2$ the value 23.833. This is simply the
variance of the empirical series of observations. From formulae (7) and
(8) we have $E x_t^2 = 11.944$. The estimates of the variances of our random
elements are $\sigma^2 = 4.946$ and $\tau^2 = 11.889$. Hence it appears that the
estimated variance of the errors of observations $\eta_t$ is more than twice as
large as the variance of the random element $e_t$ which underlies the
stochastic difference equation (13).
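The chain of computations in this example can be retraced as follows. This is a sketch under stated assumptions, not a transcription of the author's procedure: the quadratic is the one displayed as (12), while the expressions used for $b$ and $c$ are derived here from the autocorrelation relations of the observed series, with $c$ taken as the ratio of the error variance $\tau^2$ to $E x_t^2$. The function name and the returned dictionary are hypothetical.

```python
import math

def errors_in_variables_ar2(r1, r2, r3, var_y):
    """Estimate a, b, c and the variances from r_1, r_2, r_3 of the observed series."""
    # Quadratic in a: lead * a^2 + 2*r2*a + r3 = 0, lead = (r1^2 - r2^2) / (r1 - r3).
    lead = (r1 ** 2 - r2 ** 2) / (r1 - r3)
    disc = math.sqrt((2.0 * r2) ** 2 - 4.0 * lead * r3)
    for a in ((-2.0 * r2 + disc) / (2.0 * lead), (-2.0 * r2 - disc) / (2.0 * lead)):
        b = -(r3 + a * r2) / r1                  # assumed form of the relation for b
        c = -a / (r1 * (1.0 + b)) - 1.0          # assumed form: c = tau^2 / E[x^2]
        if c > 0.0:                              # keep the root that makes c positive
            var_x = var_y / (1.0 + c)            # from E[y^2] = E[x^2] + tau^2
            tau2 = var_y - var_x
            # Solve the variance formula of section 10.3.2 for sigma^2.
            sigma2 = var_x * (1.0 - b) * ((1.0 + b) ** 2 - a ** 2) / (1.0 + b)
            return {"a": a, "b": b, "c": c, "var_x": var_x,
                    "sigma2": sigma2, "tau2": tau2}
    raise ValueError("no root of the quadratic gives a positive c")

# Autocorrelations and variance of the meat consumption series as quoted in the text.
result = errors_in_variables_ar2(0.0895, -0.3513, -0.1779, 23.833)
```

Starting from the four-decimal autocorrelations, the sketch gives values close to those in the text ($a$ near -0.3137, $b$ near 0.757, $c$ near 0.995, variances near 11.94, 4.94 and 11.89); the last digits differ slightly, presumably because the original computation carried more decimals.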

These results are perhaps not very reliable for the following reasons:
(1) Our series is quite short. (2) We do not know whether a second-order
stochastic difference equation is really sufficient; a stochastic difference
equation of higher order may be more appropriate. (3) The random
elements $e_t$ and $\eta_t$ may have autocorrelation; there may also be correlation
and serial correlation between them.

If we have some confidence in our results, we may interpret them in 
the following way: The series of meat consumption (apart from the 
trend) is subject to rather large errors of observations. But the irregularly 
periodic fluctuations in this series can, to a certain extent, be explained 
by a second-order stochastic difference equation, which determines the 
way the series would develop in the absence of errors of observations. 
The random element e t represents various random influences on the 





consumption of meat, such as weather influences, epidemics, and irregular
changes in taste. The left-hand side of the difference equation (13)
shows influences of various lags, like the period which is necessary to
raise cattle and hogs. It may also show various speculative influences
which exist in the meat market.


10.3.7 STOCHASTIC DIFFERENCE EQUATIONS AND PROCESS 
ANALYSIS 


A method of using stochastic difference equations in econometric 
research has been developed by Benzel and Wold. 25 It is based upon the 
idea of process analysis, which is now widely used in theoretical economics. 
Hence the individual equations are identified (section 6.5 and Chapter 7). 
The matrix of the non-lagged coefficients is triangular. 

Let $X_{1t}$ be the demand (quantity purchased) of a commodity at point
in time $t$. The price of this commodity is $X_{2t}$. The supply (quantity
produced) is $X_{3t}$. The stock of the commodity at the end of a time
period is $X_{4t}$.

Let us assume that the demand depends only upon the contemporaneous
price. Assume further that the supply of the commodity depends upon
the price a period before and upon the stocks existing in the period before
($t - 1$). (See section 3.2.) The price depends upon the price a period
before, the contemporaneous supply, and the stocks of the period before.
Stocks finally depend upon stocks a period before. The system is linear.
This economic model is evidently related to the well-known “cobweb”
phenomenon. 26


These assumptions are very reasonable from the point of view of 
dynamic economics or process analysis. All the equations are identified 
with meaningful economic relationships. Let all relationships be linear. 
Let $a_i$, $b_i$, $c_i$, $d_i$ be constants and $e_{it}$ non-autocorrelated, not serially
correlated deviations or errors, due to variables not included in the
relationships (Chapter 7).


25 R. Benzel and H. Wold: “On statistical demand analysis from the viewpoint
of simultaneous equations,” Skandinavisk Aktuarietidskrift, vol. 29 (1946),
pp. 95 ff. H. Wold: “Estimation of economic relationships,” Econometrica,
vol. 17 (1949), supplement, pp. 1 ff.

26 E. Lundberg: Studies in the Theory of Economic Expansion (London, 1937).
E. Lindahl: Studies in the Theory of Money and Capital (New York, 1939).
P. A. Samuelson: Foundations of Economic Analysis (Cambridge, Mass., 1947),
pp. 257 ff.; “Dynamic process analysis,” in H. S. Ellis, ed.: A Survey of
Contemporary Economics (Philadelphia, 1948), pp. 352 ff. L. Hurwicz: “Stochastic
models of economic fluctuations,” Econometrica, vol. 12 (1944), pp. 114 ff.
A. Smithies: “Process analysis and equilibrium analysis,” ibid., vol. 10 (1942),
pp. 26 ff.





Under these assumptions the stochastic model becomes:

(1)  $X_{1t} = a_0 + a_1 X_{2t} + e_{1t}$

This is the demand function.

(2)  $X_{3t} = b_0 + b_1 X_{2t-1} + b_2 X_{4t-1} + e_{2t}$

This is the supply function for the commodity.

(3)  $X_{2t} = c_0 + c_1 X_{2t-1} + c_2 X_{3t} + c_3 X_{4t-1} + e_{3t}$

This is a difference equation which “explains” the price.

(4)  $X_{4t} = d_0 + d_1 X_{4t-1} + X_{3t} - X_{1t} + e_{4t}$

This equation “explains” the holding of stocks.

It has been shown in the article of Benzel and Wold that under these 
assumptions the classical method of least squares (section 5.1) can be 
used to estimate the constants which enter into systems (1) to (4). This 
follows from the application of the maximum likelihood method to the 
whole system (section 5.2). We obtain consistent estimates. (See also 
section 10.3.4 above.) No standard errors are given. It is also important 
to note that errors of observations (section 6.5) have been neglected. 

Example 1. We want to illustrate the ideas of Wold by an application
to American corn production, 1926-40. The data are yearly. We denote
by $X_{1t}$ the demand for corn, i.e., domestic disappearance in millions of
bushels. $X_{2t}$ is the average price of corn per bushel, 5 markets, all classes
of grades. $X_{3t}$ is the total supply of corn in millions of bushels. $X_{4t}$,
finally, is the stocks of corn in millions of bushels. This is carry-over
on October 1, minus exports plus imports. The data are taken from the
Yearbook of Agricultural Statistics.

According to Wold, we may use the classical method of least squares.
This will yield maximum likelihood estimates (section 5.2). This gives
us for the demand equation (1):

(5)  $X_{1t} = -5.784 X_{2t} + 2722.579$

We may compute the elasticity of the demand for corn. All the elasticities
are computed for the averages during the whole period. Our
estimate of the price elasticity of the demand for corn is -0.164.
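An elasticity evaluated at the averages is the regression coefficient multiplied by the ratio of the mean of the explanatory variable to the mean of the dependent variable. The following Python sketch shows the computation. The figures are hypothetical and are not the corn data of the text; the function names are illustrative only.

```python
import numpy as np

def ols_line(x, y):
    """Return (intercept, slope) of the least squares line y = a0 + a1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    return y.mean() - slope * x.mean(), slope

def elasticity_at_means(slope, x, y):
    """Elasticity of y with respect to x, evaluated at the sample means."""
    return slope * np.mean(x) / np.mean(y)

# Hypothetical illustrative figures, NOT the corn data used in the text.
price = np.array([80.0, 70.0, 95.0, 60.0, 85.0])
quantity = np.array([2250.0, 2320.0, 2180.0, 2370.0, 2230.0])

a0, a1 = ols_line(price, quantity)
eta = elasticity_at_means(a1, price, quantity)
print(f"slope = {a1:.3f}, elasticity at means = {eta:.3f}")
```

Because the elasticity depends on the point at which it is evaluated, a different choice (for example the last observation) would in general give a different figure.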

The supply function for corn (2) is: 

(6) $X_{3t} = -5.198X_{2,t-1} + 0.836X_{4,t-1} + 2812.488$

The elasticity of the supply of corn with respect to last year's price is
again estimated for the averages of our variables over the whole period.
It is -0.132. It is surprising that it is negative. The elasticity of the
supply of corn with respect to stocks of the year before is 0.068.

The equation which “explains” the price of corn (3) is: 

(7) $X_{2t} = 0.266X_{2,t-1} - 0.024X_{3t} + 0.041X_{4,t-1} + 103.085$




We can again compute elasticities. The elasticity of the price of corn
with respect to last year's price is 0.269. The elasticity of the price of
corn with respect to the contemporaneous supply of corn is -0.958.
The elasticity of the price of corn with respect to the stocks of corn one
year before is 0.134.

Finally we have an equation which “explains” the holding of stocks
of corn (4):

(8) $X_{4t} = 1.723X_{4,t-1} + X_{3t} - X_{1t} + 4848.498$

The elasticity of the stocks in one year with respect to the stocks of the
year before is 1.573.

All these relationships are reasonable, with the exception of the supply
function (6). But we should not forget the severe limitations under which
our estimates have been derived:

(1) We have considered the corn market as an isolated market. Actually
it is part of the total economy (sections 3.7 and 3.8). Especially
prices of other cereals and related economic quantities can be expected
to exert considerable influence.

(2) There are only lags of one year. It is possible that our result might
be improved by introducing longer lags (section 10.3.5).

(3) Our data are rather imperfect and subject to rather large errors of
observation, which have been completely neglected (sections 6.5 and ...).

(4) The relationships cannot really be considered linear over the period.
No attempt has been made, however, to introduce non-linear relationships.

(5) The results give only consistent estimates. But our series is rather
short. Hence the validity of the estimates may be seriously doubted on
this ground.

10.4 Stochastic Differential Equations and General Stochastic Processes

The problem of stochastic differential equations is much more difficult
than the treatment of stochastic difference equations. Nevertheless it
appears that stochastic differential equations are really the type of
stochastic processes which may describe most adequately “dynamic”
economic relationships. This has been indicated in the author’s business
cycle theory 1 (section 3.8; see also section 8.3).

1 G. Tintner: “A ‘simple’ theory of business fluctuations,” Econometrica,
vol. 10 ...; “The ‘simple’ theory of business fluctuations: A tentative
verification,” Review of Economic Statistics, vol. 26 (1944), pp. 148 ff.





Estimation problems in this field present unusual difficulties. We will
content ourselves with a presentation of a linear theory of stochastic
differential equations without trying to approach the problems of estimation.
The distribution questions connected with this stochastic scheme
also present very great difficulties.

Let us assume that the economic system can be described in terms of a
system of stochastic differential equations. Let the variables be $X_i(t)$
and the disturbances $\epsilon_i(t)$. Then we have a system of $n$ stochastic
differential equations:

(1) $\dot{X}_i(t) = \sum_{j=1}^{n} A_{ij} X_j(t) + A_i + \epsilon_i(t) \qquad (i = 1, 2, \cdots, n)$

The quantities $A_i$ and $A_{ij}$ are constants. By $\dot{X}_i(t)$ we denote the time
derivative of $X_i(t)$. Great mathematical difficulties arise in connection
with the definition of this derivative.

These are the economic relationships as they exist in our model. But
actually we observe not the variables $X_i(t)$ but certain discrete variables
$Y_{it}$ at $N$ points in time $(1, 2, \cdots, N)$:

(2) $Y_{it} = X_i(t) + \eta_{it} \qquad (t = 1, 2, \cdots, N;\ i = 1, 2, \cdots, n)$

Here the random variables $\eta_{it}$ are errors of observations.

We have to make convenient assumptions about the random variables
$\epsilon_i(t)$ and the errors of observations $\eta_{it}$. After adopting a stochastic
model of this nature, we would like to answer the following questions
on the basis of a knowledge of the observed values $Y_{it}$:

1. Can we estimate the structure of the system, i.e., the constants $A_i$,
$A_{ij}$ and the variances and covariances of the random variables $\epsilon_i(t)$ and
$\eta_{it}$?

2. Can we make predictions about the variables $X_i(t)$, especially
regarding certain periodicity properties or oscillatory properties?

3. Can we find fiducial or confidence limits for the estimates and
predicted values (1) and (2)?

These are still open questions and certainly very difficult to answer. 
Some promise lies, however, in recent work on stochastic processes. 

Instead of differential equations we may also use integral equations, 
mixed difference and differential equations, or even more complicated 
forms of functional relationships (see section 10.3.5). The theory of 
stochastic processes provides treatments for certain special cases. 2 

2 N. Arley: On the Theory of Stochastic Processes and Their Application to
the Theory of Cosmic Radiation (Copenhagen, 1934). W. Feller: “Die Grundlagen
der Volterraschen Theorie des Kampfes ums Dasein in
wahrscheinlichkeitstheoretischer Behandlung,” Acta Biotheoretica, vol. 5 (1939), pp. 11 ff.;




Stochastic processes have been used very extensively in connection
with problems in physics. It is possible that some of the methods
developed for the solution of physical problems may also be useful in
economic statistics. But it should be emphasized that most of the theory
which exists today is large sample theory and hence not necessarily very
useful for the treatment of problems in economics, where our time series
are short.

10.5 The Least Squares Method for Correlated Errors 

It has been shown by Aitken 1 that the method of least squares can be
adapted to the fitting of linear equations if the errors or deviations are
autocorrelated (section 10.1). The Markoff theorem applies. The

“On the theory of stochastic processes,” in J. Neyman, ed.: Proceedings of the
Berkeley Symposium on Mathematical Statistics and Probability (Berkeley, Calif.,
1949), pp. 403 ff. O. Lundberg: On Random Processes and Their Application
to Sickness and Accident Statistics (Uppsala, 1940). H. T. Davis: Analysis of
Economic Time Series (Bloomington, Ind., 1941), pp. 111 ff. N. Wiener:
“Generalized harmonic analysis,” Acta Mathematica, vol. 55 (1930), pp. 117 ff.;
Cybernetics (New York, 1949); Extrapolation, Interpolation, and Smoothing of
Stationary Time Series (New York, 1949). M. S. Bartlett: Stochastic Processes
(..., 1946). H. Cramér: “Problems in probability theory,” Annals
of Mathematical Statistics, vol. 18 (1947), pp. 105 ff. M. Fréchet: Recherches
théoriques sur la théorie de probabilité, vol. 2 (Paris, 1937). K. Karhunen:
“Ueber lineare Methoden in der Wahrscheinlichkeitslehre,” Annales Academiae
Scientiarum Fennicae, series A. I. Mathematica-Physica 37 (Helsinki, 1947).
J. L. Doob: “The elementary Gaussian process,” Annals of Mathematical
Statistics, vol. 15 (1944), pp. 229 ff. M. S. Bartlett: “On the theoretical specification
of sampling properties of autocorrelated time series,” Journal of the Royal
Statistical Society, vol. 8 (1946), supplement, pp. 27 ff. J. E. Moyal: “Stochastic
processes and statistical physics,” Journal of the Royal Statistical Society,
1949, series B, vol. 11, pp. 150 ff. M. S. Bartlett: “Some evolutionary stochastic
processes,” ibid., pp. 211 ff. D. G. Kendall: “Stochastic processes and population
growth,” ibid., pp. 265 ff. W. Feller: An Introduction to Probability
Theory and Its Applications, vol. I (New York, 1950). T. C. Koopmans: “Models
involving a continuous time variable,” in T. Koopmans, ed.: Statistical Inference
in Dynamic Economic Models (New York, 1950), pp. 384 ff. P. Whittle:
Hypothesis Testing in Time Series Analysis (Uppsala, 1951). U. Grenander:
“Stochastic processes and statistical inference,” Arkiv för Matematik, vol. 1 ...

1 A. C. Aitken: “On least squares and linear combination of observations,”
Proceedings of the Royal Society of Edinburgh, vol. 55 (1934-35), pp. 42 ff.





result of the application of the modified method of least squares will 
yield the best unbiased linear estimates 2 (section 5.1). 

Let us assume that we have a series of observations: $Y_1, Y_2, \cdots, Y_N$.
We want to fit $k$ prescribed quantities which are linearly independent:
$x_{1t}, x_{2t}, \cdots, x_{kt}$ $(t = 1, 2, \cdots, N)$. The condition of linear independence
excludes multicollinearity (section 6.5). Let $a_0, a_1, \cdots, a_k$ be a
set of constants to be determined by the method of least squares. The
fitted functions are:

(1) $Z_t = a_0 + \sum_{i=1}^{k} a_i x_{it} \qquad (t = 1, 2, \cdots, N)$

The errors or deviations are:

(2) $\zeta_t = Y_t - Z_t$

Let us assume that we know the variance and the autocovariances of
the errors:

(3) $E\zeta_r\zeta_s = u_{rs} \qquad (r, s = 1, 2, \cdots, N)$

Let us define:

(4) $[u^{rs}] = [u_{rs}]^{-1}$

So that $u^{rs}$ is an element of the matrix which is inverse to the
autovariance-covariance matrix of the errors (Appendix A.1.2).

The weighted sum of squares which is to be minimized is now:

(5) $Q = \sum_{r=1}^{N} \sum_{s=1}^{N} \zeta_r u^{rs} \zeta_s$

Minimizing (5) with respect to the constants $a_0, a_1, \cdots, a_k$ yields the
following normal equations:

(6) $\sum_{j=0}^{k} a_j \sum_{r=1}^{N} \sum_{s=1}^{N} x_{ir} u^{rs} x_{js} = \sum_{r=1}^{N} \sum_{s=1}^{N} x_{ir} u^{rs} Y_s \qquad (i = 0, 1, \cdots, k)$

where $x_{0t} = 1$.





It is evident that expressions (5) and (6) become the sum of squares to 
be minimized and the normal equations of the classical method of least 

2 D. Cochrane and G. H. Orcutt: “Application of least squares regression 
to relationships containing autocorrelated error terms,” Journal of the American 
Statistical Association , vol. 44 (1949), pp. 32 ff. D. G. Champernowne: 
“Sampling theory applied to autoregressive sequences,” Journal of the Royal 
Statistical Society , series B (methodological), vol. 10 (1948), pp. 204 ff. 





squares if the matrix $[u_{rs}]$ is a diagonal matrix (Appendix A.1.1). All
the elements not on the diagonal of this matrix are zero. In this case
the errors or deviations are not autocorrelated, and the classical method
of least squares can be applied without any modification.
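In matrix form the normal equations (6) read $(X'U^{-1}X)a = X'U^{-1}Y$, where $U = [u_{rs}]$ and the first column of $X$ consists of ones. The Python sketch below solves them directly and checks the special case just described, taking the diagonal elements to be equal so that the weights are uniform. The data are hypothetical and the function name `aitken_gls` is an illustrative assumption.

```python
import numpy as np

def aitken_gls(X, y, U):
    """Solve the normal equations (6): (X' U^-1 X) a = X' U^-1 y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    W = np.linalg.inv(np.asarray(U, dtype=float))   # [u^{rs}], formula (4)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Hypothetical data: N = 6 observations, constant term plus one regressor.
rng = np.random.default_rng(0)
N = 6
X = np.column_stack([np.ones(N), np.arange(1.0, N + 1)])   # x_0t = 1, x_1t = t
y = 2.0 + 0.5 * X[:, 1] + rng.normal(size=N)

# Uncorrelated errors with a common variance: U is a multiple of the unit matrix.
a_gls = aitken_gls(X, y, 3.0 * np.eye(N))
a_ols = np.linalg.lstsq(X, y, rcond=None)[0]    # classical least squares
print(a_gls, a_ols)
```

With unequal diagonal elements the same function gives a weighted least squares fit rather than the unweighted classical one.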

It is evident that these ideas are very important for econometric applications.
Our endeavor to transform the errors or deviations in such a way
that they are independent, i.e., not autocorrelated, will not always be
successful. These methods are discussed in Chapter 11. Besides, we
may have to smooth deviations which are originally independent by
moving averages. This will introduce autocorrelations, as we have seen
(section 8.2). It is also possible that the errors may follow an autoregressive
scheme or stochastic difference equation (see section 10.3).
In these cases we will have to deal with autocorrelated errors.

One difficulty with Aitken’s method is that it assumes knowledge of
the covariance matrix of the errors $[u_{rs}]$ in the population. This knowledge
will hardly ever be available with empirical economic time series.
Hence we will have to estimate the matrix from our data. This will
impair the practical efficiency of the method and may introduce serious
difficulties, especially with short series. Errors of observations are also
neglected. Another shortcoming of the method is the lack of distribution
theories which might enable us to compute standard errors, test hypotheses,
etc.

Consider a simple example of this method: Let equation (1) to be
fitted be:

(7) $X_{1t} = k_2 X_{2t} + \zeta_t \qquad (t = 1, 2, \cdots, N)$

Hence we have $Y_t = X_{1t}$, $x_{1t} = X_{2t}$ in the notation of equation (1).
Suppose that the $\zeta_t$ obey an autoregressive scheme of the form:

(8) $\zeta_t = B\zeta_{t-1} + \epsilon_t$

where the $\epsilon_t$ form a pure random series with mean zero and variance $\sigma^2$,
and are not autocorrelated. $B$ is a constant, less than 1 in absolute value.
This is a stochastic difference equation of the first order (section 10.3.1).

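Then, for a long series, the matrix of the variances and covariances of the
errors $[u_{rs}]$ in equation (3) is:

(9) $[u_{rs}] = \dfrac{\sigma^2}{1 - B^2} \begin{bmatrix} 1 & B & B^2 & \cdots & B^{N-1} \\ B & 1 & B & \cdots & B^{N-2} \\ B^2 & B & 1 & \cdots & B^{N-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ B^{N-1} & B^{N-2} & B^{N-3} & \cdots & 1 \end{bmatrix}$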



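Its inverse (4) is

(10) $[u^{rs}] = \dfrac{1}{\sigma^2} \begin{bmatrix} 1 & -B & 0 & \cdots & 0 & 0 \\ -B & 1 + B^2 & -B & \cdots & 0 & 0 \\ 0 & -B & 1 + B^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 + B^2 & -B \\ 0 & 0 & 0 & \cdots & -B & 1 \end{bmatrix}$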
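The relation between the covariance matrix of a first-order autoregressive error and its tridiagonal inverse can be checked numerically. The sketch below assumes the stationary form of scheme (8), for which the covariance of $\zeta_r$ and $\zeta_s$ is $\sigma^2 B^{|r-s|}/(1 - B^2)$; the function names are illustrative, and $N = 8$ is an arbitrary choice.

```python
import numpy as np

def ar1_covariance(B, N, sigma2=1.0):
    """Covariance matrix with elements sigma^2 * B^|r-s| / (1 - B^2)."""
    idx = np.arange(N)
    return sigma2 / (1.0 - B**2) * B ** np.abs(idx[:, None] - idx[None, :])

def ar1_inverse(B, N, sigma2=1.0):
    """Tridiagonal matrix: 1 at both ends of the diagonal, 1 + B^2 inside, -B beside it."""
    W = np.diag(np.full(N, 1.0 + B**2))
    W[0, 0] = W[-1, -1] = 1.0
    i = np.arange(N - 1)
    W[i, i + 1] = -B
    W[i + 1, i] = -B
    return W / sigma2

B, N = 0.0895, 8          # B as estimated later in the text; N chosen arbitrarily
product = ar1_covariance(B, N) @ ar1_inverse(B, N)
print(np.round(product, 10))   # should be the unit matrix
```

Only the pattern of the inverse matters for the estimate of the regression coefficient; a common scale factor cancels in the normal equations.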


Using the general methods indicated above we define the means of the
variables:

(11) $\bar{X}_i = \dfrac{\sum_{t=1}^{N} X_{it} - B \sum_{t=2}^{N-1} X_{it}}{N - B(N - 2)} \qquad (i = 1, 2)$

We denote deviations from the means by small letters:

(12) $x_{it} = X_{it} - \bar{X}_i \qquad (i = 1, 2;\ t = 1, 2, \cdots, N)$

Then we have for an estimate of the regression coefficient:

(13) $k_2 = \dfrac{\sum_{t=1}^{N} x_{1t}x_{2t} - B\left(\sum_{t=2}^{N} x_{1t}x_{2,t-1} + \sum_{t=2}^{N} x_{1,t-1}x_{2t}\right) + B^2 \sum_{t=3}^{N} x_{1,t-1}x_{2,t-1}}{\sum_{t=1}^{N} x_{2t}^2 - 2B \sum_{t=2}^{N} x_{2t}x_{2,t-1} + B^2 \sum_{t=3}^{N} x_{2,t-1}^2}$

One shortcoming of this method, which is due to Orcutt, 3 is of course
that we must know the constant $B$ which characterizes the autoregressive
nature of the errors or deviations $\zeta_t$, as pointed out by the author. This
will seldom be the case in statistical practice, as has been remarked
(section 10.3.1).
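When $B$ is taken as known, the weighted means, the deviations from them, and the ratio of weighted sums of products can be computed directly and compared with the solution of the general normal equations (6) using the tridiagonal weights. The Python sketch below does this on hypothetical random data; it reads the sums as running over the ranges implied by the tridiagonal matrix, and the function names are assumptions made for illustration.

```python
import numpy as np

def k2_closed_form(X1, X2, B):
    """Regression coefficient from weighted means and weighted sums of products."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    N = len(X1)

    def weighted_mean(X):                      # formula (11)
        return (X.sum() - B * X[1:-1].sum()) / (N - B * (N - 2))

    x1 = X1 - weighted_mean(X1)                # formula (12)
    x2 = X2 - weighted_mean(X2)
    num = (np.sum(x1 * x2)
           - B * (np.sum(x1[1:] * x2[:-1]) + np.sum(x1[:-1] * x2[1:]))
           + B**2 * np.sum(x1[1:-1] * x2[1:-1]))
    den = (np.sum(x2**2)
           - 2.0 * B * np.sum(x2[1:] * x2[:-1])
           + B**2 * np.sum(x2[1:-1]**2))
    return num / den                           # formula (13)

def k2_normal_equations(X1, X2, B):
    """Same coefficient from (X' W X) a = X' W X1 with a constant term."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    N = len(X1)
    W = np.diag(np.full(N, 1.0 + B**2))        # tridiagonal weights
    W[0, 0] = W[-1, -1] = 1.0
    i = np.arange(N - 1)
    W[i, i + 1] = -B
    W[i + 1, i] = -B
    X = np.column_stack([np.ones(N), X2])
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ X1)[1]

# Hypothetical series of length 23 (NOT the meat data of the text).
rng = np.random.default_rng(1)
X2 = rng.normal(size=23)
X1 = -0.2 * X2 + rng.normal(scale=0.5, size=23)
B = 0.0895
print(k2_closed_form(X1, X2, B), k2_normal_equations(X1, X2, B))
```

The two figures agree because subtracting the weighted means is equivalent to including a constant term in the weighted normal equations.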

Example 1. We want to illustrate the method of least squares with
correlated errors by considering the relationship between two series:
Let $x_{1t}$ be the quantity of meat consumed, and $x_{2t}$ the price of meat.
Under certain conditions this may yield the demand function for meat
(Chapter 2). The period covered is 1919-41; $N = 23$. Both series are
deviations from cubic trends.

An application of the classical method of least squares gives:

(14) $x_{1t} = -0.225448x_{2t}$


3 D. Cochrane and G. H. Orcutt: “Application of least squares regression 
to relationships containing autocorrelated error terms,” Journal of the American 
Statistical Association , vol. 44 (1949), pp. 32 ff. 






We compute the price elasticity of the demand for meat at the means 
of our variables. It is estimated as —0.125263. 

There is some autocorrelation of the series $x_{1t}$. It is estimated that
$B = 0.0895$. This is again a result of the application of the method of
least squares (section 10.3.3).

The matrix of the variances and covariances of the errors in $x_{1t}$ is then
from (9):

(15) $0.991998 \begin{bmatrix} 1.000000 & 0.089500 & 0.008010 & 0.000717 & 0.000064 & 0.000006 & \cdots \\ 0.089500 & 1.000000 & 0.089500 & 0.008010 & 0.000717 & 0.000064 & \cdots \\ 0.008010 & 0.089500 & 1.000000 & 0.089500 & 0.008010 & 0.000717 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{bmatrix}$

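The inverse matrix is computed with the help of formula (10):

(16) $1.008067 \begin{bmatrix} 1.000000 & -0.089500 & 0.000000 & 0.000000 & 0.000000 & \cdots \\ -0.089500 & 1.008010 & -0.089500 & 0.000000 & 0.000000 & \cdots \\ 0.000000 & -0.089500 & 1.008010 & -0.089500 & 0.000000 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{bmatrix}$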

From these data we derive the required least squares estimate, taking
into account the intercorrelation of the errors from formula (13):

(17) $x_{1t} = -0.225250x_{2t}$

The elasticity is again estimated at the means of our variables. It is
-0.125153. This is really not very different from the previous estimate
(-0.125263), which did not take the autocorrelation of the errors into
account.

This result ought to be interpreted with due caution. There are a
number of circumstances which make its validity doubtful: (1) The estimate
of $B$ which characterizes the autoregressive nature of the errors is
not very reliable because of the shortness of the series analyzed. (2)
Errors of observations, which are doubtlessly present in both series, have
been neglected. (3) The demand function may not really be linear over
the period analyzed. (4) Other variables apart from the price of meat
have to be taken into account, and a system of equations ought to be
considered (sections 3.7 and 3.8) in order to permit identification of the
empirical relationship as the demand function for meat (section 6.5 and
Chapter 7).







10.6 Correlogram Analysis 

A very important tool for the analysis of time series has been developed 
by Wold for the investigation of stationary discrete stochastic processes. 
Its utility is somewhat impaired by the lack of a small sampling theory, 
and it ought to be used with caution, especially with short economic time 
series. Stationary processes are such that the probabilities involved are 
unchanged if the time axis is subject to a translation. If there is a trend 
it must first be removed before starting the analysis (Chapter 8). 

The correlogram analysis 1 is applied to the process of moving averages 
(section 10.6.1) and to the process of linear autoregression or stochastic 
difference equations (section 10.6.2). These stochastic processes can be 
distinguished by their correlogram. For moving averages the correlogram 
is zero beyond a certain lag (section 8.2). For the process of linear 
autoregression or stochastic difference equations the correlogram is a 
damped sinusoidal oscillation (section 10.3.2). 

In practical applications, however, we should remember that the
empirical correlogram is not too reliable, especially if errors of observations
are present. The empirical correlogram may depend more upon the
particular properties of the sample (i.e., the empirical time series) than
upon the population. 2

Moving averages have been considered in section 8.2. The process 
of linear autoregression or stochastic difference equations has been 
discussed in section 10.3. 

It is interesting to note the relationship between periodogram analysis
and autocorrelation coefficients. Let $R_n^2$ be the squared amplitude
defined in section 9.1, formula (3). The $r_k$ are the autocorrelation coefficients
of the series (section 10.1). We have:



E(Rn 2 ) 



360/7 k 
N 


This formula reduces to formula (4) of section 9.2 if we have
$r_0 = 1$, $r_k = 0$ for $k \neq 0$. In this case we have the scheme of hidden
periodicities, and the methods of periodogram analysis (section 9.2) are


1 H. Wold: A Study in the Analysis of Auto-regressive Time Series (Uppsala, 
1938). H. T. Davis: The Analysis of Economic Time Series (Bloomington, 
Ind., 1941), pp. 102 ff. 

2 M. S. Bartlett: “Some aspects of the time correlation problem in regard 
to tests of significance," Journal of the Royal Statistical Society , vol. 98 (1935), 
pp. 536 ff.; “On the theoretical significance of sampling properties of auto- 
correlated time series," ibid., vol. 8 (1946), supplement, pp. 27 ff. 





indicated. The periodogram of the stochastic process of hidden periodicities
is a harmonic oscillation with constant amplitude.


10.6.1 MOVING AVERAGES 3 

Let us assume that the series $x_t$ has been created by a process of moving
averages (see section 8.2). Let $\epsilon_t$ be a random series with mean zero,
variance $\sigma^2$, and assume that it is not autocorrelated. Then we have:

(1) $x_t = \epsilon_t + b_1\epsilon_{t-1} + \cdots + b_h\epsilon_{t-h}$

We have noted that the serial correlation coefficients satisfy the relations
[formula (12) of section 8.2.3]:

(2) $r_L = \dfrac{\sum_{s=0}^{h-L} b_s b_{s+L}}{\sum_{s=0}^{h} b_s^2}$

where $b_0 = 1$, so that $r_L = 0$ for $L > h$. The expression in the denominator
of (2) is the variance of $x_t$ if the variance of the random element
$\epsilon_t$ is 1 [section 8.2.3, formula (10)].

Let us assume that the empirical non-zero autocorrelation coefficients
are $r_1, r_2, \cdots, r_h$. In practice this means that in the empirical correlogram
we neglect the autocorrelation coefficients $r_{h+1}, r_{h+2}, \cdots$ because they
appear to be small enough. An idea about their statistical significance
may be gained by the methods discussed in sections 10.1.1 and 10.1.2.
We form a polynomial:

(3) $u(x) = 1 + \sum_{L=1}^{h} r_L \left( x^L + \dfrac{1}{x^L} \right)$

In order to find its roots we substitute $z = x + 1/x$. Then we have:

(4) $v(z) = v_0 + v_1 z + \cdots + v_h z^h$

The coefficients $v_0, v_1, \cdots, v_h$ can be computed from (3).

Let the roots of this polynomial be $z_1, z_2, \cdots, z_h$. There is no stationary
stochastic process represented by a moving average if there is a root of
odd multiplicity in the real interval:

(5) $-2 < z_i < 2$

If (5) holds we may assume that the empirical autocorrelation coefficients
have been influenced by errors of observations and may adjust
them, so that (5) does not hold.


3 Wold: op. cit., pp. 121 ff., 150 ff.





We get the roots of (3) from the relation:

(6) $x_i = \dfrac{z_i}{2} \pm \sqrt{\dfrac{z_i^2}{4} - 1}$

Assume that we have $k$ different roots $x_i$. Then we form all possible
different products of $h$ terms:

(7) $(x - x_{i_1})(x - x_{i_2}) \cdots (x - x_{i_h}) = x^h + b_1 x^{h-1} + \cdots + b_h$

The coefficients of (7) give all possible moving averages (1) which have
the autocorrelation coefficients $r_1, r_2, \cdots, r_h$.
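The procedure can be sketched in Python: build $v(z)$ from the autocorrelation coefficients, find its roots, pass to the roots $x$ through $z = x + 1/x$, and multiply out products of $h$ factors. This sketch takes one root from each reciprocal pair $(x_i, 1/x_i)$, which is an assumption about how the products are to be formed, and its interval check ignores the multiplicity of roots. All function names are illustrative.

```python
import cmath
from itertools import product

import numpy as np

def v_coefficients(r):
    """Coefficients v_0..v_h (lowest degree first) of v(z), given r = [r_1..r_h]."""
    h = len(r)
    # p[L] holds x^L + x^-L written as a polynomial in z = x + 1/x.
    p = [np.array([2.0]), np.array([0.0, 1.0])]
    for L in range(2, h + 1):
        nxt = np.zeros(L + 1)
        nxt[1:] += p[L - 1]          # z * p_{L-1}
        nxt[:L - 1] -= p[L - 2]      # minus p_{L-2}
        p.append(nxt)
    v = np.zeros(h + 1)
    v[0] = 1.0
    for L in range(1, h + 1):
        v[:L + 1] += r[L - 1] * p[L]
    return v

def z_roots(r):
    return np.roots(v_coefficients(r)[::-1])

def has_forbidden_root(r, tol=1e-6):
    """True if some real root z lies strictly inside (-2, 2); multiplicity is not checked."""
    return any(abs(z.imag) < tol and abs(z.real) < 2.0 - tol for z in z_roots(r))

def moving_averages(r, decimals=4):
    """Candidate coefficient sets (b_1..b_h), one root taken from each reciprocal pair."""
    pairs = []
    for z in z_roots(r):
        d = cmath.sqrt(z * z / 4.0 - 1.0)
        pairs.append((z / 2.0 + d, z / 2.0 - d))
    found = set()
    for choice in product(*pairs):
        b = np.real(np.poly(choice))[1:]     # drop the leading coefficient 1
        found.add(tuple(np.round(b, decimals)))
    return sorted(found)

def ma_autocorrelations(b):
    """Autocorrelations r_1..r_h implied by x_t = e_t + b_1 e_{t-1} + ... + b_h e_{t-h}."""
    b = np.concatenate(([1.0], b))
    return [np.dot(b[:-L], b[L:]) / np.dot(b, b) for L in range(1, len(b))]

r = [0.614, 0.114]                            # coefficients used in Example 1 below
for b in moving_averages(r):
    print(b, np.round(ma_autocorrelations(b), 3))
```

Feeding the resulting coefficients back into `ma_autocorrelations` is a useful check that a candidate moving average really reproduces the autocorrelation coefficients from which it was derived.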

A large sample test of the goodness of fit of the scheme of moving
averages has been given by Wold. 4 We define first a matrix $[x_{ik}]$ whose
elements are the autoregression coefficients of the correlogram:

(8) $x_{ik} = \sum_{j=-\infty}^{\infty} r_j r_{j+i-k}$

The $r_j$ in this formula (8) are the autocorrelation coefficients $1, r_1, r_2, \cdots$.
We have, for instance, for $h = 1$:

(9) $x_{ii} = 1 + 2r_1^2$

(10) $x_{i,i+1} = x_{i,i-1} = 2r_1$

(11) $x_{i,i+2} = x_{i,i-2} = r_1^2$

All other elements of the matrix $[x_{ik}]$ are zero.
For $h = 2$ we have:

(12) $x_{ii} = 1 + 2r_1^2 + 2r_2^2$

(13) $x_{i,i+1} = x_{i,i-1} = 2r_1 + 2r_1r_2$

(14) $x_{i,i+2} = x_{i,i-2} = r_1^2 + 2r_2$

(15) $x_{i,i+3} = x_{i,i-3} = 2r_1r_2$

All other elements of the matrix $[x_{ik}]$ are zero.

From this matrix we derive the triangular matrix $[y_{ik}]$ ($y_{ik} = 0$ if $k > i$).
This matrix has only zero elements above the principal diagonal.

We have:

(16) $y_{11}^2 = x_{11}$

(17) $y_{11}y_{21} = x_{12}$


4 H. Wold: “A large sample test for moving averages," Journal of the Royal 
Statistical Society, series B (methodological), vol. 11 (1949), pp. 297 ff. 





(18) $y_{21}^2 + y_{22}^2 = x_{22}$

(19) $y_{11}y_{31} = x_{13}$

(20) $y_{21}y_{31} + y_{22}y_{32} = x_{23}$

(21) $y_{31}^2 + y_{32}^2 + y_{33}^2 = x_{33}$

The general formula is:

(22) $y_{i1}y_{k1} + y_{i2}y_{k2} + \cdots + y_{ik}y_{kk} = x_{ik}$

The matrix $[z_{ik}]$ is the inverse of the matrix $[y_{ik}]$. It is also triangular.
Its elements are computed in the following way:

(23) $y_{11}z_{11} = 1$

(24) $y_{21}z_{11} + y_{22}z_{21} = 0$

(25) $y_{22}z_{22} = 1$

(26) $y_{32}z_{22} + y_{33}z_{32} = 0$

(27) $y_{33}z_{33} = 1$

(28) $y_{31}z_{11} + y_{32}z_{21} + y_{33}z_{31} = 0$

The general formulae are:

(29) $y_{ii}z_{ii} = 1$

(30) $y_{ik}z_{kk} + y_{i,k+1}z_{k+1,k} + \cdots + y_{ii}z_{ik} = 0 \qquad (i \neq k)$

Then we form the functions:

(31) $R_{h+i} = z_{i1}r_{h+1} + z_{i2}r_{h+2} + \cdots + z_{ii}r_{h+i}$

It should be noted that these functions do not involve the $h$ first
autocorrelation coefficients.

It can be shown that for large samples the function:

(32) $\chi_s^2 = (N - s)R_s^2$

is distributed like $\chi^2$ with 1 degree of freedom. A more comprehensive
test is given by:

(33) $\chi_k^2 = \sum_{s=h+1}^{h+k} (N - s)R_s^2$

This quantity is for large samples distributed like $\chi^2$ with $k$ degrees
of freedom.
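The steps of the test (the matrix of formula (8), its triangular factor, the inverse of that factor, the functions $R$, and the two $\chi^2$ quantities) can be sketched in Python. The triangular factor is obtained here with a standard Cholesky routine, which satisfies the defining relation (22). As input the sketch uses the autocorrelation coefficients quoted for the meat consumption series later in this section; the function names are illustrative, and figures obtained this way need not agree to the last digit with hand computations.

```python
import numpy as np

def lag_sum(r, m):
    """Formula (8): sum over j of r_j * r_(j+m), with r_0 = 1, r_(-j) = r_j, r_j = 0 for |j| > h."""
    h = len(r)

    def rho(j):
        j = abs(j)
        if j == 0:
            return 1.0
        return r[j - 1] if j <= h else 0.0

    return sum(rho(j) * rho(j + m) for j in range(-h, h + 1))

def wold_test(r_head, r_tail, N):
    """r_head = [r_1..r_h] used in the fit; r_tail = [r_(h+1)..r_(h+k)] to be tested."""
    h, k = len(r_head), len(r_tail)
    x = np.array([[lag_sum(r_head, i - j) for j in range(k)] for i in range(k)])
    y = np.linalg.cholesky(x)                 # lower triangular, y y' = x
    z = np.linalg.inv(y)                      # inverse, also lower triangular
    R = z @ np.asarray(r_tail, dtype=float)   # formula (31)
    s = h + 1 + np.arange(k)                  # lags h+1, ..., h+k
    chi2_single = (N - s) * R**2              # formula (32), 1 degree of freedom each
    chi2_cumulative = np.cumsum(chi2_single)  # formula (33), 1, 2, ..., k degrees of freedom
    return x, R, chi2_single, chi2_cumulative

r_head = [0.0895, -0.351299]
r_tail = [-0.1779, -0.1876, -0.2724]
x, R, chi2_single, chi2_cumulative = wold_test(r_head, r_tail, N=23)
print(np.round(chi2_single, 3), np.round(chi2_cumulative, 3))
```

Each cumulative value would be compared with the tabulated $\chi^2$ limit for the corresponding number of degrees of freedom.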

Example 1. Wold analyzed the correlogram of Sir William Beveridge's 





yearly wheat price index, 1770-1869 (trend removed). 5 We give in 
Table 1 some of the empirical autocorrelation coefficients. 


TABLE 1

Lag L    Autocorrelation Coefficient $r_L$
0         1.000
1         0.614
2         0.090
3        -0.156
4        -0.115
5        -0.006
6         0.003
7        -0.006

Wold decided to neglect the empirical autocorrelation coefficients for
lags larger than 2. They are indeed rather small. Hence he wants to
determine a stochastic model of moving averages [formula (1)] with
$h = 2$:

(34) $x_t = \epsilon_t + b_1\epsilon_{t-1} + b_2\epsilon_{t-2}$

This process should yield the autocorrelation coefficients $r_1 = 0.614$
and $r_2 = 0.090$. The polynomial (3) is:

(35) $u(x) = 1 + 0.614 \left( x + \dfrac{1}{x} \right) + 0.090 \left( x^2 + \dfrac{1}{x^2} \right)$

We make the substitution $z = x + 1/x$ and obtain polynomial $v(z)$
from (4):

(36) $v(z) = 0.820 + 0.614z + 0.090z^2$

The roots of (36) are $z_1 = -1.82$ and $z_2 = -5.00$. Hence one root
lies in the prohibited interval $(-2, 2)$. From this we must conclude
that there is no stationary stochastic process of the type of moving
averages which has the autocorrelation coefficients $r_1 = 0.614$ and
$r_2 = 0.090$.

But Wold modifies the second autocorrelation coefficient to $r_2 = 0.114$.
This is a rather dubious procedure. He tries now to determine a stochastic
process of the type of moving averages with the autocorrelation
coefficients $r_1 = 0.614$, $r_2 = 0.114$. Equation (3) is now:

(37) $u(x) = 1 + 0.614 \left( x + \dfrac{1}{x} \right) + 0.114 \left( x^2 + \dfrac{1}{x^2} \right)$

5 H. Wold: A Study in the Analysis of Auto-regressive Time Series (Uppsala,
1938), pp. 150 ff. W. H. Beveridge: “Weather and harvest cycles,” Economic
Journal, vol. 31 (1921), pp. 420 ff.




The polynomial $v(z)$ (4) becomes:

(38) $v(z) = 0.772 + 0.614z + 0.114z^2$

The roots of (38) are $z_1 = -2.000$, $z_2 = -3.386$. No root lies in
the forbidden interval $(-2, 2)$. Hence a stationary stochastic process
with the given two autocorrelation coefficients exists.

From these values we find, for the roots of $u(x)$ [formula (3)] by
relation (6), $x_1 = x_2 = -1$, $x_3 = -0.3269$, and $x_4 = -3.0591$. From
these roots we form two distinct polynomials (7):

(39) $(x + 1)(x + 0.3269) = x^2 + 1.3269x + 0.3269$

and

(40) $(x + 1)(x + 3.0591) = x^2 + 4.0591x + 3.0591$

The coefficients of the powers of $x$ in (39) and (40) give the weights or
coefficients of the moving averages. To equation (39) corresponds:

(41) $x_t = \epsilon_t + 1.3269\epsilon_{t-1} + 0.3269\epsilon_{t-2}$

Finally, to equation (40) corresponds the following moving average:

(42) $x_t = \epsilon_t + 4.0591\epsilon_{t-1} + 3.0591\epsilon_{t-2}$

These are the only moving averages which define a stationary stochastic
process and have the autocorrelation coefficients $r_1 = 0.614$, $r_2 = 0.114$,
$r_L = 0$ for $L > 2$. They correspond approximately to the autocorrelation
coefficients of the wheat price index.

We may now try an economic interpretation of these findings. It is
somewhat doubtful whether the wheat price series really follows a stochastic
relationship of the type of moving averages, but we may for the
moment concede that this is so. Wold treats a stationary series, i.e., a
series whose trend has been removed. The trend of the wheat price
series is to be explained by other economic influences. Such factors are
the price of gold, technological progress, and the rise of population.

Apart from the trend, the wheat price of a given year appears as an
average of random influences of the given year, one year back, and two
years back. These random influences are probably the harvest fluctuations,
which depend mostly on weather conditions. These weather
conditions may not be strictly random, but can be expected to form more
or less a random series. They will influence the supply. If (apart from
a trend) the demand is more or less constant, then the wheat price will
on the whole depend upon the supply. Under these conditions and
taking the possibility of holding stocks into account, it appears not





unreasonable that the wheat price in a given year depends upon the 
weather conditions of the same year, the year before, and the second year 
before the year in question. 

Example 2. We apply now the ideas presented above to the American
series of the quantity of meat consumed, 1919-41. There are $N = 23$
items. They are deviations from a cubic trend (section 8.1, Example 2).
We try to fit a moving average of length 2 [formula (1)] ($h = 2$):

(43) $x_t = \epsilon_t + b_1\epsilon_{t-1} + b_2\epsilon_{t-2}$

We use the first two autocorrelation coefficients, $r_1 = 0.089500$ and
$r_2 = -0.351299$. All other autocorrelation coefficients are assumed to
be zero. This assumption may be justified by the tests presented in
section 10.1.2, Example 2.

The polynomial $u(x)$ (3) becomes now:

(44) $u(x) = 1 + 0.089500 \left( x + \dfrac{1}{x} \right) - 0.351299 \left( x^2 + \dfrac{1}{x^2} \right)$

We substitute $z = x + 1/x$ and have the polynomial $v(z)$ from formula (4):

(45) $v(z) = 1.702598 + 0.089500z - 0.351299z^2$

The roots of this polynomial (45) are $z_1 = 2.332334$ and $z_2 = -2.077565$.
Both roots are outside the prohibited interval $(-2, 2)$. Hence it follows
that a stationary process of the type of moving averages of order 2 exists
with the given first two autocorrelation coefficients.

By using formula (6) given above, we get the corresponding four roots
of $x$. They are $x_1 = 1.766084$, $x_2 = 0.566250$, $x_3 = -1.319977$, and
$x_4 = -0.757589$. Now we form the polynomials $(x - x_i)(x - x_j)$ $(i \neq j)$
[formula (7)] in order to find the coefficients of the moving averages:

(46) $(x - 1.766084)(x - 0.566250) = x^2 - 2.332x + 1.000$

The corresponding moving average is:

(47) $x_t = \epsilon_t - 2.332\epsilon_{t-1} + \epsilon_{t-2}$

(48) $(x - 1.766084)(x + 1.319977) = x^2 - 0.466x - 2.331$

The corresponding moving average is:

(49) $x_t = \epsilon_t - 0.466\epsilon_{t-1} - 2.331\epsilon_{t-2}$

(50) $(x - 1.766084)(x + 0.757589) = x^2 - 1.008x - 1.338$




To this corresponds the following moving average:

(51) $x_t = \epsilon_t - 1.008\epsilon_{t-1} - 1.338\epsilon_{t-2}$

(52) $(x - 0.566250)(x + 1.319977) = x^2 + 0.754x - 1.501$

Here we have the following moving average:

(53) $x_t = \epsilon_t + 0.754\epsilon_{t-1} - 1.501\epsilon_{t-2}$

(54) $(x - 0.566250)(x + 0.757589) = x^2 + 0.191x - 0.429$

We have the following moving average:

(55) $x_t = \epsilon_t + 0.191\epsilon_{t-1} - 0.429\epsilon_{t-2}$

Finally:

(56) $(x + 1.319977)(x + 0.757589) = x^2 + 2.078x + 1.000$

The corresponding moving average is:

(57) $x_t = \epsilon_t + 2.078\epsilon_{t-1} + \epsilon_{t-2}$


The six moving averages (47), (49), (51), (53), (55), (57) given above
are the only ones which correspond to a stationary process of moving
averages of order 2 with the two given autocorrelation coefficients.

We apply now the large sample test indicated above. The matrix
$[x_{ik}]$ computed from formulae (12), (13), (14), (15) is of the form:

TABLE 2

 1.262844    0.116118   -0.694590   -0.062882
 0.116118    1.262844    0.116118   -0.694590
-0.694590    0.116118    1.262844    0.116118
-0.062882   -0.694590    0.116118    1.262844


Some elements of the triangular matrix $[y_{ik}]$ computed from formulae
(16), (17), (18), (19), (20), (21) are indicated in Table 3.

TABLE 3

 1.123833    0.000000    0.000000    0.000000 • • •
 0.103323    1.118928    0.000000    0.000000 • • •
-0.055953    0.108943    1.017139    0.000000 • • •






Finally, the matrix $[z_{ik}]$ is given in Table 4. The quantities in this
matrix have been computed with the help of formulae (23), (24), (25),
(26), (27), (28).

TABLE 4

 0.889812    0.000000    0.000000
-0.082166    0.893713    0.000000
 0.057749   -0.095723    0.983150


We have from formula (31) for the test criteria $R_s$:

(58) $R_{h+i} = z_{i1}r_{h+1} + z_{i2}r_{h+2} + z_{i3}r_{h+3}$

We give in Table 5 the empirical autoregression coefficients $r_L$, the
quantities $R_L$ computed from formula (58), the quantity $\chi_L^2$ computed
from formula (32), and the quantity $\chi_k^2$ ($k = L - 2$) which has been
computed by the use of formula (33).


TABLE 5

Lag L    $r_L$      $R_L$     $\chi_L^2$   $\chi_k^2$
1         0.0895
2        -0.3513
3        -0.1779   -0.159    0.506    0.506
4        -0.1876   -0.153    0.446    0.952
5        -0.2724   -0.296    1.577    2.529


None of the $\chi^2$ are significant. On the 5 per cent level of significance we may have a $\chi^2$ as large as 3.841 for 1 degree of freedom. For 2 degrees of freedom ($\chi_2^2$) the permissible value is 5.991, and for 3 degrees of freedom ($\chi_3^2$), 7.815.

Hence we may say that it is not impossible that the American meat 
consumption series follows a stochastic scheme of the type of moving 
averages. It should, however, be mentioned that our tests are all large 
sample tests and hence not strictly applicable to the short series analyzed. 

These results are of somewhat dubious validity because of the reasons indicated above. Besides, the analysis in section 10.1.2, Example 2, has shown that the two first autocorrelation coefficients are not significant. Hence an analysis based upon them is not likely to be very accurate.

If we concede for the moment some meaning to the results, then they 
have to be interpreted in terms similar to those used in Example 1. 

Example 3. We apply to another American series the method of finding a stochastic process of the type of moving averages from the information contained in the correlogram. We have the N = 58 items of the wholesale price index of all commodities, 1890-1947. The data are annual. They are deviations from a trend which is itself a moving average of variable length (section 8.2, Example 1).

We want to find again a moving average of length 2 (h = 2). The first two empirical autocorrelation coefficients are $r_1 = -0.047737$ and $r_2 = -0.103224$. These two autocorrelation coefficients are not significant (section 10.1.2, Example 3). Our task is to find, if possible, the moving averages which have these first two autocorrelation coefficients and whose higher autocorrelation coefficients are zero.

The polynomial $u(x)$ [formula (3)] is now:

(59) $u(x) = 1 - 0.047737\left(x + \dfrac{1}{x}\right) - 0.103224\left(x^2 + \dfrac{1}{x^2}\right)$

From this we derive the polynomial $v(z)$ [formula (4)]:

(60) $v(z) = 1.206448 - 0.047737z - 0.103224z^2$

The roots of the last polynomial are $z_1 = 3.657710$ and $z_2 = -3.195250$. None is in the prohibited interval $(-2, 2)$. Hence we conclude that a stationary process of the type of moving averages with length 2 exists, which has the given first two autocorrelation coefficients.

The corresponding four roots of $u(x)$ [formula (6)] are $x_1 = 3.360194$, $x_2 = 0.297516$, $x_3 = 0.615292$, $x_4 = -3.810542$.

We combine them again, taking two at a time [formula (7)], in order to derive the weights of the moving averages:

(61) $(x - 3.360194)(x - 0.297516) = x^2 - 3.658x + 1.000$

The corresponding process of moving averages is:

(62) $x_t = \epsilon_t - 3.658\,\epsilon_{t-1} + \epsilon_{t-2}$

(63) $(x - 3.360194)(x - 0.615292) = x^2 - 3.975x + 2.068$

This corresponds to the following moving average:

(64) $x_t = \epsilon_t - 3.975\,\epsilon_{t-1} + 2.068\,\epsilon_{t-2}$

(65) $(x - 3.360194)(x + 3.810542) = x^2 + 0.450x - 12.804$

This solution corresponds to the following moving average:

(66) $x_t = \epsilon_t + 0.450\,\epsilon_{t-1} - 12.804\,\epsilon_{t-2}$

(67) $(x - 0.297516)(x - 0.615292) = x^2 - 0.913x + 0.183$

This corresponds to this moving average:

(68) $x_t = \epsilon_t - 0.913\,\epsilon_{t-1} + 0.183\,\epsilon_{t-2}$

(69) $(x - 0.297516)(x + 3.810542) = x^2 + 3.513x - 1.134$




This corresponds to the moving average:

(70) $x_t = \epsilon_t + 3.513\,\epsilon_{t-1} - 1.134\,\epsilon_{t-2}$

(71) $(x - 0.615292)(x + 3.810542) = x^2 + 3.195x - 2.345$

This gives the moving average:

(72) $x_t = \epsilon_t + 3.195\,\epsilon_{t-1} - 2.345\,\epsilon_{t-2}$
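
The pairing of roots used in this example can be retraced numerically. The following Python sketch is an illustration only and is not part of the original computation: it takes the four roots exactly as quoted in the text, forms the product $(x - a)(x - b)$ for every pair, and reads the two moving-average weights off the resulting quadratic. The function name `quadratic_from_roots` is an assumption made for this illustration.

```python
from itertools import combinations

# The four roots of u(x) as quoted in the text for this example.
roots = [3.360194, 0.297516, 0.615292, -3.810542]

def quadratic_from_roots(a, b):
    """Return (c1, c2) such that (x - a)(x - b) = x**2 + c1*x + c2."""
    return -(a + b), a * b

# Each pair of roots gives one candidate process
#   x_t = e_t + c1 * e_(t-1) + c2 * e_(t-2).
weights = [quadratic_from_roots(a, b) for a, b in combinations(roots, 2)]

for c1, c2 in weights:
    print(f"x_t = e_t {c1:+.3f} e_(t-1) {c2:+.3f} e_(t-2)")
```

Four roots taken two at a time give six pairs, which is why six candidate moving averages appear in the text. The pairs are generated in the same order as the displayed equations.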

These results are again of dubious validity. This is particularly the 
case because a number of autocorrelation coefficients of order higher 
than 2 have been neglected, in spite of the fact that they are statistically 
significant (section 10.1.2, Example 3). But the analysis would have been 
much more complicated if they had been taken into account. 

The interpretation is again similar to the one given above in Example 1.

10.6.2 LINEAR AUTOREGRESSION (STOCHASTIC DIFFERENCE 
EQUATIONS) 6 

If the time series $x_t$ is created by a process of linear autoregression (stochastic difference equation), we have:

(1) $x_t + a_1 x_{t-1} + \cdots + a_h x_{t-h} = \epsilon_t$

(section 10.3). The quantities $\epsilon_t$ come here from a pure random series. Their mean is zero and their variance $\sigma^2$. They are not autocorrelated.

Suppose that we have a set of empirical autocorrelation coefficients $r_1, r_2, \ldots, r_h$ and want to determine relation (1). Then it can be shown that the unknown parameters $a_i$ must obey the following relations:

(2)
$$
\begin{aligned}
a_1 + r_1 a_2 + \cdots + r_{h-1} a_h &= -r_1 \\
r_1 a_1 + a_2 + \cdots + r_{h-2} a_h &= -r_2 \\
&\ \,\vdots \\
r_{h-1} a_1 + r_{h-2} a_2 + \cdots + a_h &= -r_h
\end{aligned}
$$

If we know the empirical autocorrelation coefficients $r_1, r_2, \ldots, r_h$ we can evidently determine the constants $a_1, a_2, \ldots, a_h$ from the linear system (2). This gives consistent estimates (section 5.1). All other autocorrelation coefficients can be computed from the formula:

(3) $r_L = -a_1 r_{L-1} - a_2 r_{L-2} - \cdots - a_h r_{L-h}$

Each equation of system (2) is the special case of this relation for $L = 1, 2, \ldots, h$.

6 H. Wold: A Study in the Analysis of Stationary Time Series (Uppsala, 1938), pp. 103 ff., 178 ff. See also G. H. Orcutt: "A study of the autoregressive nature of the time series used for Tinbergen's model of the economic system of the United States, 1919-1932," Journal of the Royal Statistical Society, vol. 10 (1948), supplement, pp. 1 ff.






We form the characteristic equation:

(4) $z^h + a_1 z^{h-1} + \cdots + a_h = 0$

The modulus of the roots of this equation (4) must be less than unity, if there is a stationary stochastic process of form (1) with the coefficients $a_1, a_2, \ldots, a_h$.

The usefulness of this method is again impaired by the neglect of errors of observations and also by the unreliability of the empirical autocorrelation coefficients. It may be suspected that the methods of section 10.3.3, which are based upon the application of the principle of maximum likelihood, are preferable.

A large sample test of the goodness of fit for autoregressive schemes has recently been given by M. H. Quenouille.7 We present his results for the cases h = 1 and h = 2.

Let us assume that a first-order difference equation (autoregressive scheme with h = 1; section 10.3.1) has been fitted with the use of the first autocorrelation coefficient $r_1$. Then we compute the test functions:

(5) $R_s = r_s - 2 r_1 r_{s-1} + r_1^2\, r_{s-2}$ $(s = 2, 3, \ldots)$

It can be shown that they are independent of the first autocorrelation coefficient in the population and also mutually independent. They are also in the limit normally and independently distributed with mean zero and variance:


( 6 ) 

The variables: 

(7) 






are in the limit distributed like $\chi^2$ with 1 degree of freedom. This provides a test.


A more comprehensive test is the following:

(8) $\chi_k^2 = \sum_{s=2}^{k+1} \dfrac{R_s^2}{V_s}$


7 M. H. Quenouille: "A large sample test for the goodness of fit for autoregressive schemes," Journal of the Royal Statistical Society, vol. 110 (1947). M. Walker: "Note on a generalisation of the large sample goodness of fit test for autoregressive schemes," ibid., series B, vol. 12 (1950), pp. 102 ff. M. S. Bartlett and P. H. Diananda: "Extension of Quenouille's test for autoregressive schemes," ibid., pp. 108 ff.




This quantity is in the limit distributed like $\chi^2$ with k degrees of freedom. This provides a large sample test for the adequacy of the fit of an autoregressive scheme with h = 1.

Suppose that we fit an autoregressive scheme with h = 2 (second-order stochastic difference equation; see section 10.3.2). We form the quantities:


(9) $R_s = r_{s+2} + 2a_1 r_{s+1} + (a_1^2 + 2a_2)\,r_s + 2a_1 a_2\,r_{s-1} + a_2^2\,r_{s-2}$ $(s = 3, 4, \ldots)$

These quantities have properties similar to those of the test functions 

discussed above. They are in the limit independent and normally 
distributed with variance: 



The quantity: 



( 1 ~ * 2 ) 2 [( 1 + atf-a 

(1 + a 2 )\N - s) 



is for large samples distributed like $\chi^2$ with 1 degree of freedom.
A more comprehensive test function is:



This quantity is in the limit distributed like $\chi^2$ with k degrees of freedom.

Example 1. Wold applied these methods to Myrdahl’s cost of living 

index in Sweden, 1830-1913. 8 He deals again with deviations from the 
trend. 

The empirical autocorrelation coefficients are shown in Table 1. 


TABLE 1

Lag L    Empirical Autocorrelation Coefficients r_L
  1          0.5216
  2         -0.2240
  3         -0.5811
  4         -0.4626
  5         -0.0963
  6          0.2035
  7          0.3138
  8          0.2613
  9          0.1434
 10         -0.0034
 11         -0.1533
 12         -0.2530
 13         -0.2254
 14         -0.0042
 15          0.1883

8 H. Wold: A Study in the Analysis of Stationary Time Series (Uppsala, 1938), pp. 174 ff. G. Myrdahl: The Cost of Living in Sweden, 1830-1930 (Stockholm, 1933).


The empirical autocorrelation coefficients show the typical behavior of a process of linear autoregression. They are damped harmonics.

Wold takes h = 2. Using only the first two autocorrelation coefficients $r_1 = 0.5216$ and $r_2 = -0.2240$, we have the linear system (2) for the determination of the constants $a_1$ and $a_2$:

(13)
$$
\begin{aligned}
1.0000\,a_1 + 0.5216\,a_2 &= -0.5216 \\
0.5216\,a_1 + 1.0000\,a_2 &= 0.2240
\end{aligned}
$$

This gives the solutions $a_1 = -0.8771$ and $a_2 = 0.6815$. The characteristic equation (4) is:

(14) $z^2 - 0.8771z + 0.6815 = 0$


Its roots are $z = 0.4385 \pm 0.6994i$; $i = \sqrt{-1}$ is the imaginary unit. Their modulus is $\sqrt{0.4385^2 + 0.6994^2} = 0.825$. This is less than unity. Hence we may conclude that a stationary stochastic process of linear autoregression with the given first two autocorrelation coefficients exists:

(15) $x_t - 0.8771\,x_{t-1} + 0.6815\,x_{t-2} = \epsilon_t$

The higher autocorrelation coefficients are computed from the formula:

(16) $r_L = 0.8771\,r_{L-1} - 0.6815\,r_{L-2}$
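
The steps of this example (solving the linear system for $a_1$ and $a_2$, forming the characteristic equation, and checking the modulus of its roots) can be sketched in Python as follows. This is a minimal illustration for the case h = 2 only. The function names `fit_ar2` and `characteristic_roots` are assumptions made here, and the recursion in the last lines uses the sign convention of the scheme $x_t + a_1 x_{t-1} + a_2 x_{t-2} = \epsilon_t$.

```python
import cmath

def fit_ar2(r1, r2):
    """Solve  a1 + r1*a2 = -r1  and  r1*a1 + a2 = -r2  by Cramer's rule."""
    det = 1.0 - r1 * r1
    a1 = (-r1 + r1 * r2) / det
    a2 = (-r2 + r1 * r1) / det
    return a1, a2

def characteristic_roots(a1, a2):
    """Roots of z**2 + a1*z + a2 = 0 (the characteristic equation for h = 2)."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0

# First two empirical autocorrelation coefficients of Wold's example.
r1, r2 = 0.5216, -0.2240
a1, a2 = fit_ar2(r1, r2)
z1, z2 = characteristic_roots(a1, a2)

# The fitted scheme is stationary if both roots have modulus below 1.
stationary = max(abs(z1), abs(z2)) < 1.0

# Autocorrelations implied by the fitted scheme:
#   r_L = -a1 * r_(L-1) - a2 * r_(L-2),  starting from r_0 = 1 and r_1.
implied = [1.0, r1]
for L in range(2, 8):
    implied.append(-a1 * implied[L - 1] - a2 * implied[L - 2])

print(round(a1, 4), round(a2, 4), stationary)
print([round(v, 4) for v in implied])
```

The implied autocorrelations can be set beside the empirical values of Table 1 to judge the fit by eye. The recursion reproduces $r_2$ exactly, because $r_2$ was used in the fitting.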

Wold is not satisfied by the fit and tries to improve it in various ways. 

An economic interpretation of the results suggests itself in the following form: The relationship derived means that in the economic life of Sweden there are influences which produce lags (section 10.3.5). These are anticipations and expectations, the period of production of many commodities, etc.

Apart from the deterministic influences which produce the structure of the system, as represented by equation (15) without random element, there are also random influences, which are represented by $\epsilon_t$. These influences are weather conditions, wars, strikes, accidents of various sorts, genuine errors and mistakes in economic life, etc. Hence the appearance of a relationship like (15) for the cost of living index in Sweden is not implausible.

Example 2. We will illustrate the use of the information provided by the correlogram in order to approximate a process of linear autoregression. We use the N = 23 items of the quantity of meat consumed, 1919-41. Since we want to achieve a stationary process we use the deviations from a cubic trend (section 8.1, Example 2).

Let us assume the autoregressive process as linear and of order 2 (h = 2). This corresponds to a second-order stochastic difference equation (section 10.3.2):

(17) $x_t + a_1 x_{t-1} + a_2 x_{t-2} = \epsilon_t$

The two first empirical autocorrelation coefficients are $r_1 = 0.089500$ and $r_2 = -0.351299$. Hence the constants $a_1$ and $a_2$ can be computed from the linear system (2):


(18)
$$
\begin{aligned}
1.000000\,a_1 + 0.089500\,a_2 &= -0.089500 \\
0.089500\,a_1 + 1.000000\,a_2 &= 0.351299
\end{aligned}
$$

This gives the solutions $a_1 = -0.121918$ and $a_2 = 0.362210$. Hence the fitted process of linear autoregression is:

(19) $x_t - 0.121918\,x_{t-1} + 0.362210\,x_{t-2} = \epsilon_t$

We form the characteristic equation (4):

(20) $z^2 - 0.121918z + 0.362210 = 0$


This equation has two conjugate complex roots, $z_1 = 0.060959 + 0.598749i$ and $z_2 = 0.060959 - 0.598749i$; $i = \sqrt{-1}$ is the imaginary unit. The modulus of the roots is $\sqrt{(0.060959)^2 + (0.598749)^2} = 0.602$. This is less than 1. Hence there exists indeed a stationary stochastic process of linear autoregression (stochastic difference equation) of order 2 [formula (19)] with the two given empirical autocorrelation coefficients.

We apply now the large sample test indicated above. We have from formula (9) for the quantities $R_s$:

(21) $R_s = r_{s+2} - 0.243836\,r_{s+1} + 0.739284\,r_s - 0.088320\,r_{s-1} + 0.131196\,r_{s-2}$

Also the variances $V_s$ are according to formula (10):

(22) $V_s = \dfrac{0.861845}{23 - s}$





We give in Table 2 the quantities necessary for the test: the empirical autocorrelation coefficients $r_L$; the quantities $R_L$ computed from formula (9) above; their variances $V_L$ [formula (10)]; the quantities $\chi^2$ from formula (11); finally the test functions $\chi_k^2$ ($k = L - 2$) computed according to formula (12).

TABLE 2

Lag L      r_L        R_L       V_L       χ²       χ²_k
  1       0.0895
  2      -0.3513
  3      -0.1779    -0.315     0.043     2.308     2.308
  4      -0.1876    -0.118     0.045     0.309     2.617
  5      -0.2724     0.034     0.048     0.024     2.641
  6      -0.0154
  7       0.2382


Not one of the criteria $\chi^2$ and $\chi_k^2$ is significant. On the 5 per cent level of significance we may have a $\chi^2$ as large as 3.841 for 1 degree of freedom. For 2 degrees of freedom the permissible value is 5.991, and for 3 degrees of freedom 7.815. Hence we may conclude that our test gives negative results. The approximation by the linear scheme of autoregression (19) is reasonably good, as judged by our tests.

Our series is really too short to give us much confidence in our results. The tests used are large sample tests, which should not be applied to such short series.


If the results have some validity, then they have to be interpreted in a fashion similar to that used with Example 1. The data have been analyzed in Example 1, section 10.3.6, under the assumption that there are superimposed errors of observations.


Example 3. Our data is now the series of the American wholesale price index of all commodities, 1890-1947. The data are deviations from a trend which is itself a moving average of variable length (section 8.2, Example 1).

We want to find again a stationary process of the type of linear autoregression of extent 2 (h = 2). The first two empirical autocorrelation coefficients are $r_1 = -0.047737$ and $r_2 = -0.103224$.

The following system of linear equations (2) gives estimates of the coefficients $a_1$ and $a_2$:

(23)
$$
\begin{aligned}
1.000000\,a_1 - 0.047737\,a_2 &= 0.047737 \\
-0.047737\,a_1 + 1.000000\,a_2 &= 0.103224
\end{aligned}
$$





From this system we have for our solution $a_1 = 0.052785$ and $a_2 = 0.105744$. Hence the linear process of autoregression (1) is estimated as:

(24) $x_t + 0.052785\,x_{t-1} + 0.105744\,x_{t-2} = \epsilon_t$

In order to investigate whether the process is indeed stationary we form the characteristic equation (4):

(25) $z^2 + 0.052785z + 0.105744 = 0$

The two roots are again conjugate complex: $z_1 = -0.026393 + 0.324114i$ and $z_2 = -0.026393 - 0.324114i$; $i$ is again the imaginary unit.

The modulus of the roots is $\sqrt{(-0.026393)^2 + (0.324114)^2} = 0.325$. This is less than 1. Hence we may have indeed a stationary process of order 2.

There are again some grave objections to the validity of our proceedings. The results have to be interpreted in the same fashion as Example 1 in this section.



Chapter 11 


The Transformation of Observations 


The methods presented in Chapter 10 are ideally suited to the analysis 
of data which have the form of time series. But, practically speaking, 
they have still many shortcomings, which have been pointed out above. 

The econometrician may prefer other methods if faced with practical 
problems. He may, for instance, transform his observations in such a 
fashion that they become mutually independent. This can be done by 
using trends (section 11.1), by taking differences (section 11.2), or by 
autoregressive transformations (section 11.3). 

The transformed observations are then presumably mutually independent. We may use some of the procedures which have been indicated in Part 2.

But if we have no a priori information about the stochastic scheme 
underlying our data, we have to estimate from our observations the 
transformation which makes our data mutually independent. These 
estimates will not be too reliable with the series of medium length which 
we usually have with economic data. Hence the validity of the methods 
presented in this chapter is somewhat doubtful. 

11.1 Trends in Multiple Regressions 

11.1.1 LINEAR TRENDS 

It is the purpose of trend elimination to reduce or possibly eliminate 
the mutual dependence of successive terms of economic time series. If 
this is successful, then traditional statistical procedures can be used to 
deal with the deviations from the computed trend. 

A very important theorem was proved by R. Frisch and F. V. Waugh: 1 


1 R. Frisch and F. V. Waugh: "Partial time regression as compared with individual trends," Econometrica, vol. 1 (1933), pp. 387 ff. H. Working and H. Hotelling: "Applications of the theory of error to the interpretation of trends," Journal of the American Statistical Association, vol. 24 (1929), supplement, pp. 73 ff.




Suppose that we consider the multiple regression of $X_1$ on the set of independent variables $X_2, X_3, \ldots, X_{n-1}$. But each of the variables involved, $X_1, X_2, \ldots, X_{n-1}$, has a linear time trend. We denote time by $X_n$. Then we may compute the individual time trends by the method of least squares (section 5.1):

(1) $X_{it} = L_{i0} + L_i X_{nt}$ $(i = 1, 2, \ldots, n-1;\ t = 1, 2, \ldots, N)$

The time regression coefficients are given by:

(2) $L_i = \dfrac{S_{in}}{S_{nn}}$ $(i = 1, 2, \ldots, n-1)$

We denote again the sums of squares and cross products of the deviations of the variables $X_1, X_2, \ldots, X_n$ from their means by $S_{ij}$ [formula (2) of section 5.1]. The result (2) follows from the classical method of least squares.

We find the deviations of our variables $X_1, X_2, \ldots, X_{n-1}$ from the fitted linear trends (1) as follows:

(3) $x'_{it} = X_{it} - L_{i0} - L_i X_{nt}$ $(i = 1, 2, \ldots, n-1;\ t = 1, 2, \ldots, N)$

The deviations $x'_{it}$ will be more or less independent if the original series consisted of a linear trend and a superimposed random element.

We want now to find a multiple regression of $x'_1$ on the variables $x'_2, x'_3, \ldots, x'_{n-1}$. All these are now deviations from the linear trends:

(4) $x'_{1t} = k'_2 x'_{2t} + k'_3 x'_{3t} + \cdots + k'_{n-1} x'_{n-1,t}$ $(t = 1, 2, \ldots, N)$

Denote the sums of squares and cross products of the variables freed from linear trends by $S'_{ij}$. We have evidently:

(5) $S'_{ij} = S_{ij} - \dfrac{S_{in} S_{jn}}{S_{nn}}$



It follows from the method of least squares that the system of normal equations [formula (5) of section 5.1] for the determination of the regression coefficients $k'_2, k'_3, \ldots, k'_{n-1}$ is as follows:

(6)
$$
\begin{aligned}
S'_{22} k'_2 + S'_{23} k'_3 + \cdots + S'_{2,n-1} k'_{n-1} &= S'_{12} \\
S'_{23} k'_2 + S'_{33} k'_3 + \cdots + S'_{3,n-1} k'_{n-1} &= S'_{13} \\
&\ \,\vdots \\
S'_{2,n-1} k'_2 + S'_{3,n-1} k'_3 + \cdots + S'_{n-1,n-1} k'_{n-1} &= S'_{1,n-1}
\end{aligned}
$$


Let us now compare this situation with the following alternative procedure: We compute the multiple regression of $X_1$ on the independent variables $X_2, X_3, \ldots, X_{n-1}, X_n$. This is to say, we introduce time ($X_n$) explicitly into our regression equation:

(7) $X_{1t} = k_0 + k_2 X_{2t} + k_3 X_{3t} + \cdots + k_{n-1} X_{n-1,t} + k_n X_{nt}$ $(t = 1, 2, \ldots, N)$


The method of least squares leads to the following well-known system of normal equations:

(8)
$$
\begin{aligned}
S_{22} k_2 + S_{23} k_3 + \cdots + S_{2,n-1} k_{n-1} + S_{2n} k_n &= S_{12} \\
S_{23} k_2 + S_{33} k_3 + \cdots + S_{3,n-1} k_{n-1} + S_{3n} k_n &= S_{13} \\
&\ \,\vdots \\
S_{2,n-1} k_2 + S_{3,n-1} k_3 + \cdots + S_{n-1,n-1} k_{n-1} + S_{n-1,n} k_n &= S_{1,n-1} \\
S_{2n} k_2 + S_{3n} k_3 + \cdots + S_{n-1,n} k_{n-1} + S_{nn} k_n &= S_{1n}
\end{aligned}
$$


From this system of equations we eliminate now $k_n$ by multiplying the last equation of system (8) by $S_{in}$ and dividing it by $S_{nn}$. This gives:

(9) $\dfrac{S_{2n} S_{in}}{S_{nn}}\,k_2 + \dfrac{S_{3n} S_{in}}{S_{nn}}\,k_3 + \cdots + \dfrac{S_{n-1,n} S_{in}}{S_{nn}}\,k_{n-1} + S_{in} k_n = \dfrac{S_{1n} S_{in}}{S_{nn}}$

If expression (9) for $i = 2, 3, \ldots, n-1$ is subtracted from the first $(n - 2)$ equations of system (8), we obtain again system (6), because of the relation (5).

Hence the following theorem is true: Multiple regression of a set of 
variables which are deviations from linear time trends is equivalent to 
introducing time explicitly into the regression equation. The second 
method is much simpler computationally. 
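
The theorem can be checked numerically. The following Python sketch is an illustration on artificial data, not a reproduction of any computation in the text: the series `x1`, `x2`, `x3` are hypothetical, and the helper names `solve`, `ols_slopes` and `detrend` are assumptions made here. The sketch fits the regression once with time included explicitly and once on deviations from individual linear trends, and compares the coefficients of `x2` and `x3`.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols_slopes(y, columns):
    """Least-squares slopes of y on the columns (intercept fitted, not returned)."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yt for row, yt in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)[1:]

def detrend(x, time):
    """Deviations of x from its least-squares linear trend in time."""
    slope = ols_slopes(x, [time])[0]
    mean_x = sum(x) / len(x)
    mean_t = sum(time) / len(time)
    return [xi - mean_x - slope * (ti - mean_t) for xi, ti in zip(x, time)]

# Hypothetical trending series, generated only for this illustration.
random.seed(0)
N = 22
time = [t - (N + 1) / 2 for t in range(1, N + 1)]
x2 = [0.03 * t + random.gauss(0, 0.2) for t in time]
x3 = [-0.02 * t + random.gauss(0, 0.2) for t in time]
x1 = [1.5 * u + 0.8 * v + 0.01 * t + random.gauss(0, 0.1)
      for u, v, t in zip(x2, x3, time)]

with_time = ols_slopes(x1, [x2, x3, time])[:2]
from_deviations = ols_slopes(detrend(x1, time),
                             [detrend(x2, time), detrend(x3, time)])
print(with_time)
print(from_deviations)
```

The two printed pairs should agree up to rounding error, whatever the artificial data. The first route needs one regression; the second needs three trend fits and then a regression, which is the computational saving the theorem points to.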

Example 1. We exemplify the theory given above by the fitting of a production function for American agriculture.1 The data are yearly; the period is 1920-41. We neglect here the fact that our data are actually the result of the interaction of various economic relationships (sections 6.5 and 7.1).

Our variables are $X_1$, logarithm of the volume of agricultural production; $X_2$, logarithm of employment in agriculture; $X_3$, logarithm of agricultural operating capital; $X_4$, time, origin between 1930 and 1931. The sums of squares and cross products of the deviations of our variables from their means are given in Table 1.

1 G. Tintner: "An application of the variate difference method to multiple regression," Econometrica, vol. 12 (1944), pp. 97 ff.






TABLE 1

Matrix of Sums of Squares and Cross Products

          X_1             X_2             X_3             X_4
X_1    0.032342184    -0.005818911     0.14308161      3.526000000
X_2                    0.02954490     -0.001415043    -1.393700000
X_3                                    0.024221925    -0.3500300000
X_4                                                  995.500000000


We fit a linear regression equation (7) which includes time ($X_4$) as one of the variables. The result is:

(10) $X_1 = -0.1870 + 1.6983\,X_2 + 0.8128\,X_3 + 0.0070\,X_4$

The regression coefficients 1.6983 and 0.8128 are the elasticities of agricultural production with respect to employment and operating capital. The regression coefficient 0.0070 is an exponential trend in agricultural production. It represents a yearly increase of about 1.6 per cent in agricultural production.
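
The conversion from the trend coefficient to a yearly percentage can be made explicit. The short Python sketch below assumes that the logarithms in the regression are common (base-10) logarithms; the text does not state the base, but this is the assumption under which the coefficient 0.0070 corresponds to about 1.6 per cent.

```python
# Assumption: X_1 is a common (base-10) logarithm, so a trend coefficient c
# per year multiplies production by 10**c each year.
trend_coefficient = 0.0070
annual_growth = 10 ** trend_coefficient - 1

print(f"{100 * annual_growth:.2f} per cent per year")
```

With natural logarithms the same coefficient would correspond to roughly 0.7 per cent per year, so the figure quoted in the text is consistent only with the base-10 reading.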

Equation (10) gives the same estimates of the elasticities as the estimates we would have obtained under the following circumstances: Suppose that we had first computed the linear trends of the logarithms of agricultural production ($X_1$), employment ($X_2$), and operating capital ($X_3$). This could have been accomplished by finding the linear regressions of $X_1$, $X_2$, and $X_3$ on time ($X_4$). The linear regressions of the logarithms represent actually the determination of exponential trends.

If we then had found the regression between the deviations of the 
variables from these linear trends, we would have obtained the same 
estimates of the elasticities as given above. The procedure followed above 
is, however, much simpler. 

11.1.2 POLYNOMIAL TRENDS

It is not difficult to extend the results indicated above to trends represented by orthogonal polynomials (section 8.1) or in fact to any orthogonal functions. Let us define a set of orthogonal functions $\xi_{jt}$ $(t = 1, 2, \ldots, N)$:

(1) $\sum_{t=1}^{N} \xi_{jt} = 0$ $(j = 1, 2, \ldots, p)$

(2) $\sum_{t=1}^{N} \xi_{jt}^2 = 1$ $(j = 0, 1, \ldots, p)$

(3) $\sum_{t=1}^{N} \xi_{it}\,\xi_{jt} = 0$ if $i \neq j$ $(i, j = 0, 1, \ldots, p)$





We have a set of n variables $X_{it}$ $(i = 1, 2, \ldots, n;\ t = 1, 2, \ldots, N)$. These variables do not include time. We first fit the normalized and orthogonal functions to each of these variables:

(4) $X_{it} = \sum_{j=0}^{p} A_{ij}\,\xi_{jt}$ $(i = 1, 2, \ldots, n;\ t = 1, 2, \ldots, N)$

The degree of the polynomial to be fitted (p) may be found by methods indicated in section 8.1.

Let us introduce the notation:

(5) $T_{ij} = \sum_{t=1}^{N} X_{it}\,\xi_{jt}$ $(i = 1, 2, \ldots, n;\ j = 0, 1, \ldots, p)$

Then it follows from the orthogonality properties of the functions that we have:

(6) $A_{ij} = T_{ij}$ $(i = 1, 2, \ldots, n;\ j = 0, 1, \ldots, p)$

This gives us the polynomial trends. They have been computed by the classical method of least squares (section 5.1).
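
The fact that the least-squares coefficients reduce to simple cross products when the functions are orthogonal and normalized can be illustrated with a short Python sketch. It is not the procedure of section 8.1: the orthonormal polynomials are built here by Gram-Schmidt orthogonalization of the powers of a centred time index, which is a technique swapped in for convenience, and the series `X` is hypothetical. The names `orthonormal_polynomials` and `dot` are assumptions of this illustration.

```python
import math

def orthonormal_polynomials(n_obs, degree):
    """Gram-Schmidt on 1, t, t**2, ... over a centred time index.

    Returns degree + 1 columns; column j plays the role of xi_j.
    """
    centre = (n_obs + 1) / 2
    basis = []
    for power in range(degree + 1):
        col = [(t - centre) ** power for t in range(1, n_obs + 1)]
        for b in basis:
            proj = sum(c * bi for c, bi in zip(col, b))
            col = [c - proj * bi for c, bi in zip(col, b)]
        norm = math.sqrt(sum(c * c for c in col))
        basis.append([c / norm for c in col])
    return basis

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

N, p = 23, 3
xi = orthonormal_polynomials(N, p)

# Hypothetical series, used only to illustrate the coefficient formulas.
X = [100.0 + 0.5 * t - 0.02 * t ** 2 + math.sin(t) for t in range(1, N + 1)]

T = [dot(X, xi[j]) for j in range(p + 1)]   # cross products with each function
A = T                                       # least-squares coefficients
trend = [sum(A[j] * xi[j][t] for j in range(p + 1)) for t in range(N)]
deviations = [x - m for x, m in zip(X, trend)]

# The deviations are orthogonal to every fitted function.
print([round(dot(deviations, xi[j]), 10) for j in range(p + 1)])
```

No system of normal equations has to be solved here, because the cross-product matrix of orthonormal functions is the identity. That is the practical advantage of working with orthogonal functions.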

We find the deviations of each variate from its fitted value:

(7) $x'_{it} = X_{it} - \sum_{j=0}^{p} A_{ij}\,\xi_{jt}$ $(i = 1, 2, \ldots, n;\ t = 1, 2, \ldots, N)$

The regression of $x'_{1t}$ on $x'_{2t}, x'_{3t}, \ldots, x'_{nt}$ is:

(8) $x'_{1t} = k'_2 x'_{2t} + k'_3 x'_{3t} + \cdots + k'_n x'_{nt}$

We introduce the sums of the squares and cross products of the deviations:

(9) $S'_{ij} = \sum_{t=1}^{N} x'_{it}\,x'_{jt} = S_{ij} - \sum_{k=1}^{p} A_{ik} A_{jk}$ $(i, j = 1, 2, \ldots, n)$

The system of normal equations for the determination of the regression coefficients $k'_2, k'_3, \ldots, k'_n$ is:

(10)
$$
\begin{aligned}
S'_{12} &= k'_2 S'_{22} + k'_3 S'_{23} + \cdots + k'_n S'_{2n} \\
S'_{13} &= k'_2 S'_{23} + k'_3 S'_{33} + \cdots + k'_n S'_{3n} \\
&\ \,\vdots \\
S'_{1n} &= k'_2 S'_{2n} + k'_3 S'_{3n} + \cdots + k'_n S'_{nn}
\end{aligned}
$$

Alternatively, suppose we just include a set of p orthogonal functions into our regression equation:

(11) $X_{1t} = k_0 + k_2 X_{2t} + k_3 X_{3t} + \cdots + k_n X_{nt} + B_1 \xi_{1t} + B_2 \xi_{2t} + \cdots + B_p \xi_{pt}$ $(t = 1, 2, \ldots, N)$




The system of normal equations is now:

(12)
$$
\begin{aligned}
S_{12} &= k_2 S_{22} + k_3 S_{23} + \cdots + k_n S_{2n} + B_1 T_{21} + B_2 T_{22} + \cdots + B_p T_{2p} \\
S_{13} &= k_2 S_{23} + k_3 S_{33} + \cdots + k_n S_{3n} + B_1 T_{31} + B_2 T_{32} + \cdots + B_p T_{3p} \\
&\ \,\vdots \\
S_{1n} &= k_2 S_{2n} + k_3 S_{3n} + \cdots + k_n S_{nn} + B_1 T_{n1} + B_2 T_{n2} + \cdots + B_p T_{np} \\
T_{11} &= k_2 T_{21} + k_3 T_{31} + \cdots + k_n T_{n1} + B_1 \\
T_{12} &= k_2 T_{22} + k_3 T_{32} + \cdots + k_n T_{n2} + B_2 \\
&\ \,\vdots \\
T_{1p} &= k_2 T_{2p} + k_3 T_{3p} + \cdots + k_n T_{np} + B_p
\end{aligned}
$$

With the help of the last p equations of this system (12) and also using equation (9), we are able to transform system (12) into system (10). Hence the use of deviations from a set of orthogonal functions is equivalent to the introduction of these functions into the multiple regression equation.

This method may be used under the following circumstances: ( a ) if 
there is a polynomial trend in all our variables (section 8.1); ( b) if the 
variables have systematic parts which may be approximated by a Fourier 
series (section 9.1). We want to find the multiple regression of the 
deviations of our observations from the fitted functions. We believe 
that in this way we can reduce the mutual dependence of successive 
observations. Then we may just introduce the functions to be fitted 
into the regression equation. The method of least squares will give us 
the same result as the more complicated analysis of all the deviations 
from the fitted functions. 

Example 1. We use the method of orthogonal polynomials to find a demand and a supply function of meat for the United States. We use data from the period 1919-41; N = 23. The data are annual. (See section 7.2, Example 1.) $Y_1$ is quantity of meat, $Y_2$ price of meat, $Z_1$ income, and $Z_2$ cost of producing meat. The means of the variables are given in Table 1.

TABLE 1

Arithmetic Means

Symbol    Variable             Mean
Y_1       Quantity of meat     166.1913
Y_2       Price of meat         92.3391
Z_1       Income               495.5652
Z_2       Cost                  88.4217





We assume that there are errors in the equations but not in the variables, and that there are cubic trends in demand and supply. Then the demand equation is:

(13) $Y_1 = b_{12} Y_2 + c_{11} Z_1 + d_{11}\,\xi_1' + d_{12}\,\xi_2' + d_{13}\,\xi_3' + b_{10}$

The supply equation is:

(14) $Y_1 = b_{22} Y_2 + c_{22} Z_2 + d_{21}\,\xi_1' + d_{22}\,\xi_2' + d_{23}\,\xi_3' + b_{20}$

e symbol { is an orthogonal polynomial of degree /. We assume 

represen^'powers oftLT "7"°" ^ poI y nomials which 

and (14) is just identtfied (Action TJf" 0 ” ^ ^ ° 3) 

^ ™ bya 

( ' 5) - Vl 0 0032742, - 0.239761z 2 - 0.2677372,' + 0 115026f ' 

+ 0.102069^3' °^ 2 

n6) ; ' 2 = ° —^0 056*170' 426 ° 3Za + 0-2673201,' - 0.0580812,' 

, 2 from thC meanS - '* " e "“ 

(section 5.2) of the ^demand functionf ^ d maX,mUm l,kellhood estimate 

" 7,n ■ -VSS&Z » + 

■rend" ‘' 7) now a '“ bi « 

increase of .he demand r„ r me.f.'n ihe'perS P ' esumabl > a ^ 

e I a s nbn y "i s'T. n o' Qdd j 6 9 taco m" d l'" COme elaS,iC ' ti ' S al lhc "“*»». price 
■I* results of «1 ZZT.T'Z’ ^” 2 ' 32 - U ‘ “ “»»• ■ ■». 
to those during the period ’ . d C ,° nd,tlons are approximately similar 

I Per cent. The^XTl T ThC pnCe ° f meat -creases by 

to decrease by about 7 / G f \ ‘ 7' ^ m ?c CXpect the de mand for meat 

increases by 1 per cemanl 7 T' " °" the ° ther hand the income 

expect an increase in the demand fl ' nS remams the same > we may 

Similarly, we may ehmmaT mcomeTz ) / ^ V » ° f ' per Cent ' 

and (16). Then we obtain an annr ^ h ° m ‘ he tW ° ec ) uati ons (15) 
meat: Pproximation to the supply function of 


(18) Y. 


^ M5M,V H 2 iS 9 4§8~ °' 26 ° 682fl ' + 011 34922./ 





The price elasticity of the supply of meat is negative, namely — 0.014665. 
This is, however, so small that it may very well be zero in the population. 
Hence it is just imaginable that the supply of meat is independent or 
nearly independent of the contemporaneous price of meat. 

The demand and the supply function of meat are the same as the relationships we would have found if we had proceeded as follows: We eliminate first the cubic trends of our variables $Y_1$, $Y_2$, $Z_1$, $Z_2$. Then we use the method of least squares to find the reduced form equations, i.e., express the deviations of the endogenous variables $Y_1$ and $Y_2$ from their cubic trends in terms of the deviations of the exogenous variables $Z_1$ and $Z_2$ from their cubic trends.

The validity of the results should not be overestimated. There are reasons to doubt whether we can approximate the trends of our variables by cubics and in this way produce sufficient independence of successive deviations from these trends. Actually, the trends may be more complicated functions. Dynamic factors (lags, etc.) have been neglected.

It is also not quite legitimate to consider the meat market in isolation. Especially, there will be strong interrelationships between the demand and production conditions of meat and other foods. These have been completely neglected. It would have been more satisfactory to construct a model of the total economy (sections 3.7 and 3.8) and consider the market of meat as a part of this more comprehensive model.

11.2 Variate Difference Method 1 

In this section we deal with a method which transforms the observations 
by taking finite differences. In this way we may be able to estimate the 
variance of the error component (sections 11.2.2 and 11.2.3). The 
methods available for this procedure involve again a multiple choice 
problem (section 1.2). 

Having estimated the order of the difference in which this is accomplished, we may use the method of moving averages (section 8.2) to accomplish an approximation to the non-random part of the series (section 11.2.4).

It is somewhat doubtful whether this method is really widely applicable. It is certainly not valid if the series follows a stochastic scheme of linear autoregression (stochastic difference equation, sections 10.3.6 and 10.6.2).

Applications of the variate difference method to the estimation of the error variance have been given in section 6.5, which deals with the problem of errors in the variables.

1 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940).




11.2.1 THE ASSUMPTIONS 

Let the observed item be $X_t$ $(t = 1, 2, \ldots, N)$. But this observed value $X_t$ consists of two additive parts: the systematic part $M_t$ and the random element or error $\epsilon_t$:

(1) $X_t = M_t + \epsilon_t$ $(t = 1, 2, \ldots, N)$

The random element $\epsilon_t$ is not autocorrelated. It has mean zero and variance $\sigma^2$. For some of the tests to be described below we must assume that it is normally distributed.

Different methods of analysis are appropriate for various assumptions about the systematic part $M_t$. If $M_t$ is a polynomial, we may fit orthogonal polynomials to the observed series $X_t$ (section 8.1). If $M_t$ is periodic with known period, we may approximate $M_t$ by fitting a Fourier series to the data (section 9.1). If $M_t$ is periodic with unknown period, we may use periodogram analysis to find the period of $M_t$ (section 9.2). We may also use moving averages to approximate the systematic part (section 8.2).

The variate difference method is based upon a desire to avoid unnecessarily specific assumptions about the functional character of $M_t$. It is not possible with the systematic parts of economic data to say definitely that over a period of years they can be exactly represented by polynomials, by periodic functions, or by any other specific kind of functions. But it seems likely that the systematic parts $M_t$ belong to the class of "smooth" functions. By this we mean the following:

The autocorrelations (section 10.1) of $M_t$ and $M_s$ ($s > t$) are either positive, zero, or small negative numbers. This excludes a "zigzag" movement of the systematic part. The individual items cannot be such that highs and lows follow each other in a pretty regular fashion.

Hence we must exclude from our analysis certain monthly data with very pronounced seasonal variations. 2 In this case the variate difference method is not applicable. We have first to remove the seasonal variation.

2 A. Wald: Berechnung und Ausschaltung von Saisonschwankungen (Vienna, 1936), pp. 37 ff.





If the systematic part is smooth in this sense, we can eliminate or at least greatly reduce the systematic part $M_t$ by taking finite differences.
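The effect of differencing on a smooth systematic part can be illustrated by a small sketch. The quadratic trend below is a hypothetical example chosen only for illustration; the function name `finite_differences` is likewise an assumption and does not come from the text.

```python
def finite_differences(series, order):
    """Return the forward differences of the given order."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces the series by its first differences.
        values = [b - a for a, b in zip(values, values[1:])]
    return values

# Hypothetical systematic part: a quadratic trend M_t = 3 + 2t + t^2.
trend = [3 + 2 * t + t * t for t in range(1, 11)]

second = finite_differences(trend, 2)   # constant for a quadratic trend
third = finite_differences(trend, 3)    # vanishes identically
```

A polynomial of degree 2 is thus removed completely by the third differences. For a merely "smooth" systematic part the elimination is only approximate.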

11.2.2 LARGE SAMPLE TEST 

We need now a test for the following decision: In which finite difference series has the systematic part $M_t$ been sufficiently eliminated so that we may say with some confidence that this difference series and all difference series of higher order represent approximately the random element $\epsilon_t$ alone?

Let us assume for the sake of simplicity that the random element $\epsilon_t$ comes from a normal population with mean zero and variance $\sigma^2$. We make also the assumption that the $\epsilon_t$ are not autocorrelated and that they are not correlated with the systematic parts $M_t$:

(1)  $E\epsilon_t = 0 \qquad (t = 1, 2, \ldots, N)$

(2)  $E\epsilon_t \epsilon_s = 0 \text{ if } t \neq s \qquad (s, t = 1, 2, \ldots, N)$

(3)  $E\epsilon_t^2 = \sigma^2 \qquad (t = 1, 2, \ldots, N)$

(4)  $EM_t \epsilon_s = 0 \qquad (s, t = 1, 2, \ldots, N)$

A large sample test has been given by O. Anderson 3 on the basis of earlier work. We want to test the hypothesis that the variance of the difference series of order $k$ (properly reduced) is approximately equal to the variance of difference series $k + 1$. The variances are computed in the following way:

(5)  $V_0 = \dfrac{\sum_{t=1}^{N} (X_t - \bar{X})^2}{N - 1}$

This is the sample variance of our empirical observations. The quantity $\bar{X} = \sum_{t=1}^{N} X_t / N$ is the sample mean.


3 O. Anderson: Die Korrelationsrechnung in der Konjunkturforschung (Frankfurt, 1929). R. Zaycoff: "Ueber die Ausschaltung der zufaelligen Komponente nach der Variate Difference Methode," Publications of the Statistical Institute for Economic Research, State University of Sofia, vol. 1 (1937), pp. 75 ff. G. Tintner: op. cit., pp. 67 ff. J. N. Beretoni: The Variate Difference Method (unpublished thesis, Ames, Iowa, 1938). H. Strecker: "Die Quotientenmethode, eine Variante der Variate Difference Methode," Mitteilungsblatt für mathematische Statistik, vol. 1 (1949), pp. 115 ff.





Let us denote by $\Delta^k X_t$ the $k$th finite difference of $X_t$. Then we have for the estimates of the variances of the difference series:

(6)  $V_k = \dfrac{\sum_{t=1}^{N-k} (\Delta^k X_t)^2}{(N - k)\;{}_{2k}C_k} = \Big[\sum_{t=1}^{N-k} (\Delta^k X_t)^2\Big] \cdot A_{kN}$

where:

(7)  ${}_{2k}C_k = \dfrac{(2k)!}{(k!)^2}$

is a binomial coefficient. It is the number of the combinations of $2k$ things, taking $k$ at a time.

It can be shown that with a pure random series the variances of the successive difference series increase proportionally to the binomial coefficients ${}_{2k}C_k$. The quantities $A_{kN}$ are tabulated in the author's book, The Variate Difference Method. 4
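The statement about the binomial coefficients can be checked with a short sketch. It assumes only that the random elements are uncorrelated with a common variance, so that the variance of a linear combination is the sum of the squared coefficients times that variance. The function names are hypothetical.

```python
from math import comb

def difference_coefficients(k):
    """Coefficients of the k+1 consecutive items entering the k-th forward difference."""
    return [(-1) ** (k - j) * comb(k, j) for j in range(k + 1)]

def variance_multiplier(k):
    """Variance of the k-th difference of a pure random series, in units of sigma^2."""
    return sum(c * c for c in difference_coefficients(k))

multipliers = [variance_multiplier(k) for k in range(6)]
central = [comb(2 * k, k) for k in range(6)]   # the coefficients 2kCk of formula (7)
```

Both lists agree, which is why the sum of squares of the $k$th differences is divided by the binomial coefficient before the variances of successive difference series are compared.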

If the systematic part is eliminated in the finite difference series of order $k_0$, we ought to have approximately:

(8)  $V_{k_0} = V_{k_0+1} = V_{k_0+2} = \cdots$

In order to test the approximate equality of $V_k$ and $V_{k+1}$ we compute the standard error of $V_{k+1} - V_k$:

(9)  $s_{V_{k+1} - V_k} = \dfrac{V_k}{H_{kN}} \qquad (k = 0, 1, 2, \ldots)$

$H_{kN}$ is a quantity which can be computed from formulae given by Zaycoff 5 on the basis of the work of O. Anderson. These quantities are tabulated in the author's book, The Variate Difference Method. 6

For large values of $N$ and for $k \geq 6$ we have the following asymptotic formula due to Anderson: 7

(10)  $\dfrac{1}{H_{kN}^2} = \dfrac{(3k + 1)\sqrt{2\pi k}}{2(2k + 1)^3 (N - k - 1)} \qquad (k = 6, 7, \ldots)$


We proceed as follows: We compute the quantities:

(11)  $R_k = \dfrac{(V_k - V_{k+1})\, H_{kN}}{V_k} \qquad (k = 0, 1, 2, \ldots)$

These quantities are for large samples approximately normally distributed with mean zero and variance 1.


4 G. Tintner: op. cit., pp. 43 ff.

5 R. Zaycoff: op. cit.

6 G. Tintner: op. cit., pp. 57 ff.

7 O. Anderson: op. cit., p. 113.






Assume that we find a $k_0$ such that $R_{k_0 - 1}$ is significant, but $R_{k_0}$ is not significant, from the point of view of a given significance level. Then we may assume that the systematic part of our empirical series has been approximately eliminated in the $k_0$th difference series.

Hence the variance of the $k_0$th finite difference series $V_{k_0}$ is an estimate of $\sigma^2$, i.e., of the variance of our random element $\epsilon_t$ in the population which corresponds to our sample. It should be pointed out that the choice of the order of the difference $k_0$ is again a multiple choice problem (section 1.2) and that the procedure outlined above is not entirely satisfactory.

The theory presented here is not actually restricted to normal distributions of the random element $\epsilon_t$. Formulae and tables are also given in the author's book, The Variate Difference Method, which deal with a random element which is not normally distributed. 8 These formulae involve the fourth moments of the observations and the differences. They will be omitted here.

The variate difference method gives an estimate of the population variance of the random element which is of course not completely efficient. Following Fisher, 9 we may compare the efficiency of estimating the variance in a pure random series by $V_0$ [formula (5)] and estimating it by $V_{k_0}$ in the variate difference method. This problem has recently been treated by A. P. Morse and F. E. Grubbs, 10 who also provide a table for evaluating the efficiency for various $N$ and $k_0$.

These methods reveal the relative inefficiency of the estimates of the 
variance of the random element if it is based upon difference series. The 
efficiency is the smaller the higher the order of the series of differences 
from which the variance of the random element has been estimated. 

There are of course some other difficulties connected with the application 
of this method. It is often doubtful whether the stochastic scheme (1) 


8 G. Tintner: op. cit., pp. 51 ff.

9 R. A. Fisher: "On the mathematical foundations of theoretical statistics," Philosophical Transactions of the Royal Society, series A, vol. 222 (1921), pp. 309 ff.; "Theory of statistical estimation," Proceedings of the Cambridge Philosophical Society, vol. 22 (1925), pp. 285 ff.; Statistical Methods for Research Workers (10th ed., Edinburgh, 1946), pp. 12 ff., section 3. M. G. Kendall: The Advanced Theory of Statistics, vol. 2 (London, 1946), pp. 5 ff.

10 A. P. Morse and F. E. Grubbs: "The estimation of dispersion from differences," Annals of Mathematical Statistics, vol. 18 (1947), pp. 194 ff. See also M. S. Bartlett: "Some aspects of the time correlation problem in regard to tests of significance," Journal of the Royal Statistical Society, vol. 98 (1935), pp. 540 ff.





holds. Frequently we may have to assume the existence of a linear scheme of autoregression (stochastic difference equation), treated in section 10.3. Then the variate difference method is not applicable, as pointed out by M. G. Kendall. 11

If we have several variables and systems of relationships between them, then we may use the estimates of the variances (and also of the covariances 12) for an approximation of the covariance matrix of the errors. The theory of the treatment of equations which have errors in the variables has been given in section 6.5.

An exact test for the equality of the variances of two consecutive difference series will be given in section 11.2.3.

A transformation of the observations which is based upon the first differences of the original series will be discussed in section 11.3.

Example 1. The following data are taken from the author's book, The Variate Difference Method. 13 Yearly United States wheat-flour prices, 1890-1937, were analysed. There are 48 years; $N = 48$. The variance of the original series of observations ($V_0$), the variances of the difference series ($V_k$) computed from formula (6), and the standard error ratios computed from formula (11) are presented in Table 1.


TABLE 1

Order of Difference    Variance    Standard Error Ratio
         k               V_k               R_k

         0              4.7969           5.7878*
         1              0.7020           5.0514*
         2              0.4402           1.9368
         3              0.3931           0.8952
         4              0.3767           0.6668
         5              0.3662           0.8012

* Significant at the 5, 1, and 0.1 per cent levels.


From the data presented in Table 1 we would be inclined to conclude that the systematic element $M_t$ has probably been approximately eliminated in the first or second difference series. $R_1$ is significant at the 0.1 per cent level of significance, but $R_2$ is not significant, not even at the 5 per cent level. We may, for instance, take $V_2 = 0.4402$ as our estimate of the variance of the random element $\epsilon_t$, namely of $\sigma^2$. This is an estimate of the variance of the random element in the population from which our sample of 48 prices is taken.

11 M. G. Kendall: Contributions to the Study of Oscillatory Time Series (Cambridge, 1946), pp. 47 ff.

12 G. Tintner: op. cit., pp. 117 ff.

13 Ibid., pp. 40 ff.

We may now evaluate our estimate as far as its efficiency is concerned. 
The tables given by Morse and Grubbs indicate for $N = 48$ and $k_0 = 2$
an efficiency of about 0.5. This is to say, the accuracy of our estimate 
of the random variance in the population from the second finite difference 
series of 48 observations is about equivalent to a similar estimate from 
a series of only 24 independent observations, which form a random sample. 

One may wonder whether a sample of 48 is large enough to permit us 
to use the large sample test indicated above. The low efficiency points 
to the great inaccuracy of the results. 

One may also doubt whether the wheat-flour prices follow actually 
the simple formula (1) and not a stochastic difference equation (section
10.3) or linear scheme of autoregression (section 10.6.2). An analysis 
of the autocorrelation coefficients of this series, indicated in section 10.1.2, 
Example 1, shows that there is certainly this possible alternative. If 
the series follows such a stochastic scheme, then the application of the 
variate difference method is not justified. 
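The choice of the order $k_0$ in the large sample test of section 11.2.2 can be sketched as follows. The ratios are those printed in Table 1 for the wheat-flour prices. The critical value 1.96 (two-sided 5 per cent point of the normal distribution) is an assumption made for this illustration, and the function name `choose_order` is hypothetical.

```python
def choose_order(ratios, critical):
    """Return the smallest k whose standard error ratio R_k is not significant."""
    for k, r in enumerate(ratios):
        if abs(r) < critical:
            return k
    return None  # every ratio is significant: no order can be chosen

# R_k for k = 0, ..., 5 as printed in Table 1 (wheat-flour prices).
wheat_ratios = [5.7878, 5.0514, 1.9368, 0.8952, 0.6668, 0.8012]

# Assumed critical value: two-sided 5 per cent point of the standard normal.
k0 = choose_order(wheat_ratios, 1.96)
```

With these figures the rule selects the second difference series, in agreement with the estimate $V_2 = 0.4402$ used in the example. A different significance level may of course lead to a different order, which is the multiple choice problem mentioned in the text.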

11.2.3 EXACT TEST 

It is somewhat doubtful whether with economic data we ever have enough observations to apply efficiently the large sample test presented in section 11.2.2. But it is possible to apply a test which is not completely efficient because it utilizes only part of the data. 14 It is, however, exact and based upon Fisher's variance ratio ($z$) test. 15

We want again to test the approximate equality of the variance of the $k$th difference series, $V_k$, and the variance of the $(k + 1)$st difference series, $V_{k+1}$. But in order to make the exact test it is necessary to select certain items from the series of the squares of differences of the $k$th difference series and from the series of the squares of differences in the $(k + 1)$st difference series. These items ought to be selected in such a way that they are independent. Under these circumstances the conditions of Fisher's analysis of variance test are fulfilled.

This method was described in an article by the author published in 
1939. 16 Tables which facilitate the application of the test are given in 


14 Ibid., pp. 73 ff.

15 R. A. Fisher: Statistical Methods for Research Workers (10th ed., Edinburgh, 1946), pp. 192 ff. G. Tintner: "On tests of significance in time series," Annals of Mathematical Statistics, vol. 10 (1939), pp. 139 ff.

16 G. Tintner: "On tests of significance in time series," Annals of Mathematical Statistics, vol. 10 (1939), pp. 139 ff.





the author's book, The Variate Difference Method. 17 Another method of selection was recently proposed by Johnson. 18

We select from the series of the squares of differences in the difference series $k$ the following items: $j$, $j + (2k + 3)$, $j + 2(2k + 3)$, $\ldots$. From the squares of the differences of the difference series of order $(k + 1)$ we select these items: $j + k + 1$, $j + k + 1 + (2k + 3)$, $j + k + 1 + 2(2k + 3)$, $\ldots$. This gives us about $(N - k)/(2k + 3)$ items of the squares of the $k$th differences and approximately $(N - k - 1)/(2k + 3)$ items of the series of the squares of the $(k + 1)$st differences. Here $j = 1, 2, \ldots, 2k + 3$. Hence we have $2k + 3$ possible selections. These are, however, not independent. 19 We form the sum of squares of the selected differences:


(1)  $S_k = \sum{}' (\Delta^k X_t)^2$

The summation $\sum'$ is now extended over the selected items indicated above. The items are independent under our assumptions (2) to (5), and hence we may apply the exact variance ratio test:

(2)  $F = \dfrac{S_{k+1}\,(N - k)\;{}_{2k}C_k}{S_k\,(N - k - 1)\;{}_{2k+2}C_{k+1}}$

is distributed like Snedecor's $F$ (variance ratio) 20 with approximately $(N - k)/(2k + 3)$ and $(N - k - 1)/(2k + 3)$ degrees of freedom.

In the author's book, The Variate Difference Method, tables are presented which facilitate the application of the test. 21 Limits are given for the quantity:

(3)  $G_k = \dfrac{S_{k+1}}{S_k}$

The 5, 1, and 0.1 per cent levels of significance are given in tables in 
The Variate Difference Method. 
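The selection rule and the variance ratio of formula (2) can be sketched as follows. Positions are counted from 1 within each difference series, as in the text. The function names and the argument `n_obs` (the number of observations $N$) are hypothetical, and the sketch only illustrates the rule; it does not replace the tabulated limits.

```python
from math import comb

def selection_indices(j, k, n_obs):
    """Return 1-based positions of the selected k-th and (k+1)-st differences."""
    step = 2 * k + 3
    # The k-th difference series has n_obs - k items.
    lower = list(range(j, n_obs - k + 1, step))
    # The (k+1)-st difference series has n_obs - k - 1 items.
    upper = list(range(j + k + 1, n_obs - k, step))
    return lower, upper

def variance_ratio(s_lower, s_upper, k, n_obs):
    """F of formula (2) from the selected sums of squares S_k and S_{k+1}."""
    numerator = s_upper * (n_obs - k) * comb(2 * k, k)
    denominator = s_lower * (n_obs - k - 1) * comb(2 * k + 2, k + 1)
    return numerator / denominator

# Selections with j = 1 for a series of 48 observations.
lower_0a, upper_0a = selection_indices(j=1, k=0, n_obs=48)
lower_1a, upper_1a = selection_indices(j=1, k=1, n_obs=48)
```

The spacing $2k + 3$ ensures that no two selected differences share an original observation, which is what makes the selected squares independent under the assumptions of the model.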

17 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), pp. 75 ff.

18 N. L. Johnson: "Tests of significance in the variate difference method," Biometrika, vol. 35 (1948), pp. 206 ff.

19 T. Haavelmo: "The variate difference method," Econometrica, vol. 9 (1941), pp. 74 ff. G. Tintner: "The variate difference method: a reply," ibid., pp. 163 ff. T. C. Koopmans: "A review of G. Tintner: The Variate Difference Method," Review of Economics and Statistics, vol. 26 (1944), pp. 105 ff.

20 G. W. Snedecor: Statistical Methods (4th ed., Ames, Iowa, 1946), pp. 218 ff.

21 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), pp.





If $N$ is large, the number of degrees of freedom for $F$ becomes nearly equal. Then we may take the quantity:

(4)  $z = \tfrac{1}{2} \log_e F$

as almost normally distributed with zero mean and with variance $(2k + 3)/(N - k - 1)$. 22

This test is exact but not very efficient because it utilizes only a fraction 
of the available observations. However, we may make a number of 
tests in trying to compare the variances of two consecutive difference series. 
These tests are not independent, and their joint distribution is unknown. 
Still, we might use these results in order to form an idea about the 

probability of the equality of the variances of two consecutive difference 
series. 23 

Example 1. We use again the 48 yearly observations on American wheat-flour prices. The ratios of sums of squares of selected differences are presented in Table 1. 24

For these selections we have taken from the total squares of differences the following selected differences:

For selection 0-A the squares of the deviations from the mean of the following items of the original series: 1, 4, 7, 10, .... From the squares of the first differences we have included numbers 2, 5, 8, 11, ....

For selection 0-B the sums of squares of the deviations from the mean of the original observations numbered 2, 5, 8, 11, ... were taken. The sums of squares of the first differences numbered 3, 6, 9, 12, ... were included for this selection.

For selection 0-C we choose the sums of the squares of the deviations from the mean of the following items of the original series of prices: 3, 6, 9, 12, .... The sums of squares of the following first differences: 4, 7, 10, 13, ..., were chosen for this selection.

The ratios of the selected sums of squares are presented in Table 2 
[formula (3)]. The ratios $S_1/S_0$ are tested with the help of tables presented
in the author's book. If we choose a level of significance of 5 per cent 
the limits are 0.186 and 1.353. On a level of significance of 1 per cent 
we have the limits 0.135 and 1.856. On the level of significance of 0.1 
per cent we have the limits 0.0946 and 2.659. All our empirical ratios 


22 R. A. Fisher: Statistical Methods for Research Workers (10th ed., Edin¬ 
burgh, 1946), p. 226. 

23 G. Tintner: “Foundations of probability and statistical inference," 
Journal of the Royal Statistical Society , vol. 112 (1949), pp. 252 ff. 

24 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), 
pp. 93 ff. 





TABLE 1

Selection    Ratio of Sums of Squares
                      G_k

0-A               31.15236‡
0-B               16.31728‡
0-C               11.62031‡
1-A                1.44941*
1-B                3.43416†
1-C                0.16441
1-D                0.27489
1-E                0.20595
2-A                0.26046
2-B                0.39901
2-C                0.14361
2-D                0.06582
2-E                0.45729
2-F                0.39138
2-G                0.90786
3-A                0.61593
3-B                0.22396
3-C                0.34731
3-D                0.05459
3-E                0.01735†
3-F               33.61659‡
3-G                0.20294
3-H                0.57390
3-I                2.37788*
4-A                0.19830
4-B                0.07381
4-C                0.59821
4-D                3.00484*
4-E                2.23956*
4-F                0.02064†
4-G                1.65375
4-H                0.35500
4-I                0.02900*
4-J                0.02167*
4-K                0.06184

* Significant at the 5 per cent level of significance.
† Significant at the 5 and 1 per cent levels of significance.
‡ Significant at the 5, 1, and 0.1 per cent levels of significance.





for the various selections 0-A, 0-B, and 0-C fall outside all these limits. Hence we must reject the hypothesis that the variances of the original series and of the series of first differences are equal. The systematic component has not been eliminated or reduced to a sufficient degree.

Now we compare the variance of the first difference series with the variance of the second difference series by making various selections from the series of the squares of the first and second differences. Selection 1-A is made by taking the squares of the following first differences: 1, 6, 11, 16, ..., and the squares of these second differences: 3, 8, 13, 18, ....

Selection 1-B consists of the sum of squares of the following first differences: 2, 7, 12, 17, ..., and the sum of squares of the following second differences: 4, 9, 14, 19, ....

Selection 1-C is made by taking the following squares of the series of first differences: 3, 8, 13, 18, ..., and the squares of the series of second differences: 5, 10, 15, 20, ....

Selection 1-D is carried out by taking the following squares from the series of first differences: 4, 9, 14, 19, .... The sum of the squares of these differences is compared with the sum of squares of the following selected second differences: 6, 11, 16, 21, ....

Selection 1-E implies the selection of these squares from the series of first differences: 5, 10, 15, 20, ..., and the following squares of the series of second differences: 7, 12, 17, 22, ....

The ratios of the sum of squares of selected first differences and sum of squares of selected second differences are again given in Table 2. In order to test them we take our limits from the table given in the author's book.

On the 5 per cent level of significance the limits for the ratio $S_2/S_1$ are 0.0918 and 1.216. On the 1 per cent level the limits are 0.0610 and 1.835. Finally, on the 0.1 per cent level we have the following limits: 0.0381 and 2.949. None of the ratios for 1-A, 1-B, 1-C, 1-D, 1-E fall outside any of these limits. Hence we cannot reject the hypothesis that the variance of the first series of differences is approximately equal to the variance of the series of second finite differences. We may assume that we have already eliminated the systematic part of our variables by taking one finite difference, or at least have reduced it to a sufficient degree.

This is confirmed by a look at the lower parts of Table 1. We find some ratios which are significant, but the majority is clearly not significant. Hence we may again take the variance of the first or second difference series as an approximation to the variance of the random element $\sigma^2$ in the population which corresponds to our sample (section 11.2.2, Example 1). This result holds only if the fundamental assumptions which underlie the stochastic model of the variate difference method are fulfilled.





11.2.4 REDUCTION OF THE RANDOM VARIABILITY 

We want sometimes to find an approximation to the systematic parts of our variables $M_t$. This is possible if we know that the systematic part of our variable has been approximately eliminated in the difference series of order $k_0$. Tests for this proposition have been given in sections 11.2.2 and 11.2.3.

A very ingenious method devised by O. Anderson 25 permits us to approximate the systematic parts of our variables by systems of moving averages (section 8.2). These are closely connected with the Gram polynomials. 26

Let us assume that our tests have convinced us that the systematic part of our series has been eliminated or at least greatly reduced in the finite difference series of order $k_0$. In order to simplify our notation we define $n = k_0/2$ if $k_0$ is even and $n = (k_0 + 1)/2$ if $k_0$ is odd.

Then we consider the approximate systematic component $M_t'$ as a function of the differences of order $2n$ of the original series $X_t$. We use the method of least squares and derive the following system of moving averages:

(1)  $M_t' = \sum_{j=1}^{m} g_{n,m}(j)\,(X_{t+j} + X_{t-j}) + g_{n,m}(0)\,X_t$


This is an average of length $2m + 1$. The $m$ is arbitrary. The larger it is, the smaller will be the residual variance of $M_t'$. The weights of the moving average $g_{n,m}(j)$ ($j = 0, 1, \ldots, m$) are tabulated in the author's book, The Variate Difference Method. 27

This procedure is equivalent to the following: We fit a polynomial of order $n$ to all consecutive $2m + 1$ observations by the method of least squares (section 8.2.1). The degree of the polynomial is determined by the order of the difference series $k_0$ in which the systematic part of the observations has been approximately eliminated. The length of the moving average $2m + 1$ will be given by the degree of accuracy which we desire to obtain.
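The equivalence between the moving average and a least-squares polynomial fit can be illustrated by deriving the weights directly. The sketch below assumes a second-degree polynomial fitted to five consecutive observations ($m = 2$) and evaluated at the middle item; it works with exact fractions. The helper names `solve` and `smoothing_weights` are hypothetical, and the tabulated weights of the author's book are not reproduced here.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def smoothing_weights(m, degree):
    """Weights giving the fitted value at the centre of 2m + 1 equally spaced items."""
    design = [[j ** p for p in range(degree + 1)] for j in range(-m, m + 1)]
    # Normal equations X'X of the least-squares polynomial fit.
    normal = [[sum(row[a] * row[b] for row in design) for b in range(degree + 1)]
              for a in range(degree + 1)]
    # The fitted value at j = 0 is the constant term of the polynomial.
    c = solve(normal, [1] + [0] * degree)
    return [sum(ci * xi for ci, xi in zip(c, row)) for row in design]

weights = smoothing_weights(2, 2)
reduction = sum(w * w for w in weights)   # variance of the smoothed random part / sigma^2
```

For this case the weights are $-3/35$, $12/35$, $17/35$, $12/35$, $-3/35$, and the sum of their squares equals the central weight $17/35 \approx 0.4857$. This sum is the factor by which the variance of an uncorrelated random element is reduced by the moving average.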

We can also compute without great difficulty the residual variance of $M_t'$ defined in formula (1). This is the random variation remaining in the data after the moving average (1) has been applied. Let the estimated random variance of the original series be $V_{k_0}$. Then the error variance of $M_t'$ is given by:

(2)  $V' = V_{k_0} L_{nm}$

[section 8.2.3, formula (10)]. The constants $L_{nm}$ are again tabulated in the author's book, The Variate Difference Method. 28 This formula should be helpful in determining the length of the moving average.

25 O. Anderson: op. cit. G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), pp. 100 ff.

26 H. T. Davis: Tables of the Higher Mathematical Functions, vol. 2 (Bloomington, Ind., 1935), pp. 307 ff.

27 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940), pp. 101 ff.

Example 1. We want to illustrate the procedures of the variate difference method by another example. We will analyze the yearly series of the quantity of meat consumed in the United States, 1919-41 (section 8.1, Example 2). The mean of the series is 166.1913, its variance 62.25174. We present the series and its first five finite differences in Table 1.


TABLE 1

Differences of Quantity of Meat

       Original   First      Second     Third      Fourth     Fifth      Smoothed
Year   Series     Differ-    Differ-    Differ-    Differ-    Differ-    Series
                  ences      ences      ences      ences      ences
 t      X_t       ΔX_t       Δ²X_t      Δ³X_t      Δ⁴X_t      Δ⁵X_t      M_t'

1919   171.5      -4.5        2.0        5.3       -7.3       -6.3
1920   167.0      -2.5        7.3       -2.0      -13.6       33.1
1921   164.5       4.8        5.3      -15.6       19.5      -12.5      165.121
1922   169.3      10.1      -10.3        3.9        7.0      -22.2      170.465
1923   179.4      -0.2       -6.4       10.9      -15.2       17.3      177.733
1924   179.2      -6.6        4.5       -4.3        2.1        4.3      178.604
1925   172.6      -2.1        0.2       -2.2        6.4      -12.0      173.904
1926   170.5      -1.9       -2.0        4.2       -5.6        5.2      170.320
1927   168.6      -3.9        2.2       -1.4       -0.4        6.1      168.052
1928   164.7      -1.7        0.8       -1.8        5.7       -8.9      165.179
1929   163.0      -0.9       -1.0        3.9       -3.2       -8.0      163.033
1930   162.1      -1.9        2.9        0.7      -11.2       14.1      161.612
1931   160.2       1.0        3.6      -10.5        2.9       49.5      160.472
1932   161.2       4.6       -6.9       -7.6       52.4     -144.4      162.159
1933   165.8      -2.3      -14.5       44.8      -92.0      159.5      165.558
1934   163.5     -16.8       30.3      -47.2       67.5      -82.6      159.008
1935   146.7      13.5      -16.9       20.3      -15.1        2.0      154.581
1936   160.2      -3.4        3.4        5.2      -13.1       15.0
1937   156.8       0.0        8.6       -7.9        1.9                 158.096
1938   156.8       8.6        0.7       -6.0                            158.918
1939   165.4       9.3       -5.3                                       165.235
1940   174.7       4.0
1941   178.7

28 Ibid., p. 106. 





The original series has $N = 23$ items. We see that the series of first differences consists of only 22 items, and with each series of higher differences we lose 1 item.

The differences are computed in the well-known fashion. We have, for instance, for the first differences, 167.0 - 171.5 = -4.5. The next item in the series of first differences is 164.5 - 167.0 = -2.5, etc. The second differences, $\Delta^2 X_t$, are simply the differences of the first differences. We have -2.5 - (-4.5) = 2.0; 4.8 - (-2.5) = 7.3, etc. The third differences, $\Delta^3 X_t$, are the differences of the second differences, etc.

A check is provided in the following manner: The sum of each column is equal to the difference between the last and the first items of the previous column.
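The computation of the differences and the column check can be sketched as follows. Only the first six values of the meat series are used, so the figures are an excerpt and not the full table; exact decimal arithmetic is used to avoid rounding noise. The function names are hypothetical.

```python
from decimal import Decimal

def difference_table(series, max_order):
    """Return [series, first differences, second differences, ...]."""
    table = [list(series)]
    for _ in range(max_order):
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

def column_check(table):
    """Each column must sum to last minus first item of the previous column."""
    return all(sum(table[k]) == table[k - 1][-1] - table[k - 1][0]
               for k in range(1, len(table)))

# First six yearly values of the meat series (1919-1924).
meat = [Decimal(v) for v in ("171.5", "167.0", "164.5", "169.3", "179.4", "179.2")]
table = difference_table(meat, 3)
check_passed = column_check(table)
```

The check holds identically for correctly computed differences, so a failure points to an arithmetical slip somewhere in the column concerned.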

The sums of the squares of the differences, the coefficients $A_{23,k}$ needed for computing the variances $V_k$, these variances, the coefficients $H_{23,k}$, and the standard error ratios $R_k$ are shown in Table 2.

TABLE 2

Difference Analysis

Order of                    Variance of                                Standard
Differ-     Sum of          Difference                                 Error         Prob-
ence k      Squares         V_k           A_{23,k}       H_{23,k}      Ratio R_k     ability

   0                        62.25174                      4.29026       2.665        0.0076
   1           905.64       23.58636      0.02604378      7.95862       2.209        0.0270
   2         1,860.23       17.04112      0.009160760     9.87374       0.830        0.4066
   3         5,411.77       15.73921      0.00290833     10.88103       0.311        0.7166
   4        17,321.25       15.28964      0.00088271     11.31964       0.009        0.9200
   5        58,446.06       15.30118      0.000261820

The coefficients $A_{23,k}$ and $H_{23,k}$ are taken from tables in the author's book, The Variate Difference Method.

We have, for instance, from Table 2: The sum of the squares of the third differences is 5,411.77. The quantity $A_{23,3}$ is 0.00290833. Hence the variance of the third difference series is $V_3 = (5{,}411.77)(0.00290833) = 15.73921$.

In order to compute the standard error ratios $R_k$ we use the coefficients $H_{23,k}$. We have, for instance, from formula (11) of section 11.2.2, for a comparison of the variance of the series of third differences ($V_3$) and the variance of the series of fourth differences ($V_4$): $R_3 = (15.73921 - 15.28964)(10.88103)/15.73921 = 0.311$.

The probabilities are computed under the assumption of a normal distribution, which is in our case only an approximation.
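The arithmetic of this example can be retraced with a short sketch. The numerical constants are those quoted in the text and in Table 2. The two-sided normal probability is an assumption about how the last column of the table was obtained, and the variable names are hypothetical.

```python
from math import erfc, sqrt

sum_sq_third = 5411.77      # sum of squares of the third differences
a_23_3 = 0.00290833         # tabulated coefficient for N = 23, k = 3
h_23_3 = 10.88103           # tabulated coefficient entering R_3
v4 = 15.28964               # variance of the fourth difference series

v3 = sum_sq_third * a_23_3              # formula (6)
r3 = (v3 - v4) * h_23_3 / v3            # formula (11)

def two_sided_probability(r):
    """P(|Z| >= |r|) for a standard normal variable Z."""
    return erfc(abs(r) / sqrt(2))

p0 = two_sided_probability(2.665)       # ratio R_0 of Table 2
```

The values of `v3` and `r3` agree with 15.73921 and 0.311 to the accuracy shown, and `p0` is close to the probability 0.0076 given for $R_0$.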




We conclude that the systematic component has been sufficiently reduced or eliminated in the third series of differences. Hence we have $k_0 = 3$, and we may take the variance of the third difference series, $V_3 = 15.73921$, as an approximation for the variance of the random element.

The test indicated above is not strictly valid since our sample is not large enough. Hence we make a series of new tests and use various selections as indicated in section 11.2.3.


TABLE 3

Selection    Ratio of Sums of Squares of
             Selected Differences

1-A                6.127†
1-B                0.845
1-C                0.155
1-D                0.143
1-E                0.237
2-A                3.798*
2-B                5.372†
2-C                1.907
2-D                1.308
2-E                0.061
2-F                0.01363†
2-G                0.1147†
3-A                0.01052*
3-B                0.002268‡
3-C                0.05304
3-D                0.4837
3-E                1.028
3-F               56.109‡
3-G               35.144‡
3-I                1.065
4-A                7.947
4-B                5.240
4-C               46.846
4-D                7.515
4-E               20.332‡
4-F                0.4174
4-G                0.4268

* Significant at the 5 per cent level of significance.
† Significant at the 1 per cent level of significance.
‡ Significant at the 0.1 per cent level of significance.




The evidence of the selections is not very clear. It seems, however, that we will probably not commit too great an error if we assume the systematic part to be eliminated or at least substantially reduced in the third or fourth difference series. There is, however, the possibility that this series follows a scheme of linear autoregression or a stochastic difference equation (section 10.3).

Now we might want to reduce the random variability of the original series and approximate the systematic component by a system of moving averages. We have assumed that the systematic part is eliminated or greatly reduced in the third series of differences. Hence we have $k_0 = 3$ and $n = 2$. We will choose a system of moving averages which is equivalent to fitting a second-degree parabola to a number of consecutive observations. The number of these observations ($2m + 1$), i.e., the length of the moving average, is arbitrary. We will get a more accurate result if we include more items in the moving average.

We choose $m = 2$. This is to say, we smooth the series by a system of moving averages of length $2m + 1 = 5$. This is equivalent to fitting by the method of least squares a second-degree parabola to each consecutive five observations. The weights of the five-item moving average are again from tables in the book, The Variate Difference Method: $w_{-2} = -0.0857143$, $w_{-1} = 0.3428571$, $w_0 = 0.4857143$, $w_1 = 0.3428571$, $w_2 = -0.0857143$.

From the tables given in the author's book we see that the random variability is reduced in the proportion $L_{22} = 0.4857143$. The approximate variance of the random element is the variance of the series of third differences, $V_3 = 15.73921$. We have from formula (2) for the residual random variance in the series of moving averages just described, $V' = (15.73921)(0.4857143) = 7.64476$.

All these results should be treated with proper caution. The assumption is that the empirical series consists of a smooth systematic part and a random component, whose individual items have mean zero and constant variance, and are not autocorrelated. This assumption, especially, might be questioned.

H-3 Autoregressive Transformations 

Another method of transformation may be applied if the
deviations follow a linear stochastic difference
scheme (sections 10.3 and 10.6.4). Suppose that we have the following
relationship between p variables:

(1)    $X_{1t} = k_0 + k_2 X_{2t} + \cdots + k_p X_{pt} + \zeta_t$    $(t = 1, 2, \cdots N)$

Let $\varepsilon_t$ be a random variable with mean zero, $E\varepsilon_t = 0$, and variance $\sigma^2$:
$E\varepsilon_t^2 = \sigma^2$. We also assume that $\varepsilon_t$ and $\varepsilon_s$ $(t \neq s)$ are not autocorrelated:
$E\varepsilon_t\varepsilon_s = 0$ for all $t \neq s$.


Assume that the random variable $\zeta_t$ is generated by an autoregressive
scheme:

(2)    $\zeta_t = A\zeta_{t-1} + \varepsilon_t$


This is a first-order stochastic difference equation (section 10.3.1). 

Then it is advisable to replace the variables X it in equation (1) by the 
transformed variables: 


(3)    $X_{it} - AX_{it-1}$    $(i = 1, 2, \cdots p;\ t = 2, 3, \cdots N)$


We obtain from (1): 


(4)    $(X_{1t} - AX_{1t-1}) = k_0(1 - A) + k_2(X_{2t} - AX_{2t-1}) + \cdots + k_p(X_{pt} - AX_{pt-1}) + \varepsilon_t$

We note that the deviations or errors $\varepsilon_t$ in (4) are now not autocorrelated.

Hence the classical method of least squares can be applied without 

modification. The Markoff theorem applies (section 5.1). 
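The step from (1) to (4) can be written out explicitly. This is a sketch that assumes relationship (1) also holds at time t - 1 and uses only the symbols of (1) and (2): lag (1) by one period, multiply by A, and subtract.

```latex
\begin{align*}
X_{1t} &= k_0 + k_2 X_{2t} + \cdots + k_p X_{pt} + \zeta_t \\
A X_{1t-1} &= A k_0 + k_2 A X_{2t-1} + \cdots + k_p A X_{pt-1} + A \zeta_{t-1} \\
X_{1t} - A X_{1t-1} &= k_0 (1 - A) + k_2 (X_{2t} - A X_{2t-1}) + \cdots
  + k_p (X_{pt} - A X_{pt-1}) + (\zeta_t - A \zeta_{t-1})
\end{align*}
```

By (2) the last term equals $\varepsilon_t$, which is not autocorrelated by assumption. One observation is lost in the subtraction, so t runs from 2 to N.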

This method necessitates, unfortunately, knowledge of the constant A 

in the stochastic process (2). This will, however, be in general unknown 
with empirical data. 

Orcutt 1 suggests proceeding in this way: Fit relation (1) by the method 
of least squares. Then test the residuals for lack of autoregression. 
If there is autoregression, we may try to find the constant A from the 
autoregressive properties of the residuals. This constant should then 
be used in a new trial. Now fit relationship (4) by the method of least 
squares, where A has been derived from the autoregressive properties of 
the residuals from (1). We test the residuals from the fit (4) for randomness
(sections 10.1.1 and 10.1.2). If they are not random, we try to obtain
a better estimate of A, etc.
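The iteration just described can be sketched for a single explanatory variable. The text does not say how A is to be obtained from the residuals; the sketch below assumes a regression of each residual on its predecessor without a constant. All function names and the simulated data are hypothetical and serve only as an illustration.

```python
import random

def ols_simple(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def lag1_coefficient(e):
    """Assumed estimate of A: regress e_t on e_{t-1} without a constant."""
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    den = sum(e[t - 1] ** 2 for t in range(1, len(e)))
    return num / den

def iterate_transformation(x, y, max_iter=20, tol=1e-6):
    k0, k2 = ols_simple(x, y)                     # fit relation (1)
    a = 0.0
    for _ in range(max_iter):
        resid = [yt - k0 - k2 * xt for xt, yt in zip(x, y)]
        a_new = lag1_coefficient(resid)           # estimate A from the residuals
        xs = [x[t] - a_new * x[t - 1] for t in range(1, len(x))]   # variables (3)
        ys = [y[t] - a_new * y[t - 1] for t in range(1, len(y))]
        c, k2 = ols_simple(xs, ys)                # fit relation (4); c estimates k0(1 - A)
        k0 = c / (1.0 - a_new)                    # breaks down if the estimate of A is 1
        converged = abs(a_new - a) < tol
        a = a_new
        if converged:
            break
    return k0, k2, a

# Simulated data with k0 = 2, k2 = 3 and A = 0.6 (illustration only).
random.seed(0)
x = [random.uniform(0, 10) for _ in range(400)]
z, y = 0.0, []
for xt in x:
    z = 0.6 * z + random.gauss(0, 0.5)
    y.append(2.0 + 3.0 * xt + z)
k0_hat, k2_hat, a_hat = iterate_transformation(x, y)
print(round(k0_hat, 3), round(k2_hat, 3), round(a_hat, 3))
```

A test of the final residuals for randomness (sections 10.1.1 and 10.1.2) is not included in this sketch; the loop simply stops when the estimate of A no longer changes.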


1 D. Cochrane and G. H. Orcutt: “Application of least squares regression
to relationships containing autocorrelated error terms,” Journal of the American
Statistical Association, vol. 44 (1949), pp. 32 ff. G. H. Orcutt and D. Cochrane:
“A sampling study in the merits of autoregressive and reduced form transformations
in regression analysis,” ibid., vol. 44 (1949), pp. 356 ff. See also A. R.
Prest: “Some experiments in demand analysis,” Review of Economics and
Statistics, vol. 31 (1949), pp. 33 ff.





Consider, for instance, the relationship: 


(5)    $X_{1t} = k_0 + k_2 X_{2t} + \zeta_t$    $(t = 1, 2, \cdots N)$

The variable $\zeta_t$ follows an autoregressive scheme of form (2). We [...]

We have the means: 


(6)    $\bar{X}_{it} = \dfrac{\sum_{t=2}^{N} X_{it}}{N - 1}$    $(i = 1, 2)$

(7)    $\bar{X}_{it-1} = \dfrac{\sum_{t=2}^{N} X_{it-1}}{N - 1}$    $(i = 1, 2)$


Denote the deviations from the means by small letters: 


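(8)    $x_{it} = X_{it} - \bar{X}_{it}$

(9)    $x_{it-1} = X_{it-1} - \bar{X}_{it-1}$

Then our estimate of the regression coefficient $k_2$ in formula (5) is:

(10)    $k_2 = \dfrac{\sum_{t=2}^{N} x_{1t}x_{2t} - A\left(\sum_{t=2}^{N} x_{1t}x_{2t-1} + \sum_{t=2}^{N} x_{1t-1}x_{2t}\right) + A^2\sum_{t=2}^{N} x_{1t-1}x_{2t-1}}{\sum_{t=2}^{N} x_{2t}^2 - 2A\sum_{t=2}^{N} x_{2t}x_{2t-1} + A^2\sum_{t=2}^{N} x_{2t-1}^2}$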
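As a check on formula (10), the sketch below computes $k_2$ in two ways: from the sums of products of deviations, and as the ordinary least-squares slope of the transformed variable $X_{1t} - AX_{1t-1}$ on $X_{2t} - AX_{2t-1}$. The two should agree. The data and the value of A are hypothetical, and the function names are not from the book.

```python
def deviations(values):
    """Deviations of a list from its own mean, as in (8) and (9)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def k2_formula_10(X1, X2, A):
    """Estimate k_2 by formula (10); all sums run over t = 2, ..., N."""
    x1, x1_lag = deviations(X1[1:]), deviations(X1[:-1])
    x2, x2_lag = deviations(X2[1:]), deviations(X2[:-1])
    numerator = (dot(x1, x2) - A * (dot(x1, x2_lag) + dot(x1_lag, x2))
                 + A ** 2 * dot(x1_lag, x2_lag))
    denominator = dot(x2, x2) - 2 * A * dot(x2, x2_lag) + A ** 2 * dot(x2_lag, x2_lag)
    return numerator / denominator

def k2_transformed(X1, X2, A):
    """Least-squares slope of (X_1t - A X_1t-1) on (X_2t - A X_2t-1)."""
    y = deviations([a - A * b for a, b in zip(X1[1:], X1[:-1])])
    x = deviations([a - A * b for a, b in zip(X2[1:], X2[:-1])])
    return dot(x, y) / dot(x, x)

# Hypothetical data, for illustration only.
X1 = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
X2 = [1.0, 2.0, 1.5, 3.0, 2.5, 4.0]
A = 0.5
k2_a = k2_formula_10(X1, X2, A)
k2_b = k2_transformed(X1, X2, A)
print(k2_a, k2_b)
```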

On the basis of older work connected with the variate difference method 2
it has been suggested by Orcutt that we may conveniently take $A = 1$.
This may be true with many economic time series [...]: the price in the
next time unit $(t + 1)$ is equal to the price at time $t$ plus a random
change $\varepsilon_{t+1}$ [...]. We define:

(11)    $\Delta X_{it} = X_{it+1} - X_{it}$    $(i = 1, 2, \cdots p;\ t = 1, 2, \cdots N - 1)$

This is the first difference of $X_{it}$. We may estimate relationship (1) from
the equation:

(12)    $\Delta X_{1t} = k_2\Delta X_{2t} + k_3\Delta X_{3t} + \cdots + k_p\Delta X_{pt}$    $(t = 1, 2, \cdots N - 1)$


2 G. Tintner: The Variate Difference Method (Bloomington, Ind., 1940).





For the estimation of relationship (12) just given we may use the method 
of least squares without any modification. 
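A minimal sketch of the first-difference transformation (11) followed by a least-squares fit of (12) with two explanatory variables and no constant term. The data are hypothetical and constructed without error so that the coefficients can be recovered exactly; the helper names are not from the book.

```python
def first_differences(series):
    """Delta X_t = X_{t+1} - X_t, as in (11)."""
    return [b - a for a, b in zip(series[:-1], series[1:])]

def fit_two_regressors_no_constant(y, x2, x3):
    """Least squares for y = k2*x2 + k3*x3 via the 2x2 normal equations."""
    s22 = sum(a * a for a in x2)
    s33 = sum(a * a for a in x3)
    s23 = sum(a * b for a, b in zip(x2, x3))
    s2y = sum(a * b for a, b in zip(x2, y))
    s3y = sum(a * b for a, b in zip(x3, y))
    det = s22 * s33 - s23 ** 2          # zero if the two regressors are proportional
    return (s2y * s33 - s3y * s23) / det, (s3y * s22 - s2y * s23) / det

# Hypothetical series obeying X1 = 5 + 2*X2 - X3 exactly.
X2 = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
X3 = [2.0, 2.0, 5.0, 3.0, 7.0, 4.0]
X1 = [5.0 + 2.0 * a - 1.0 * b for a, b in zip(X2, X3)]

k2, k3 = fit_two_regressors_no_constant(
    first_differences(X1), first_differences(X2), first_differences(X3))
print(k2, k3)
```

The constant $k_0$ disappears on differencing, which is why the fit carries no intercept.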

Orcutt’s method is very ingenious. But in the practical application 
we have the difficulty that the nature of the autoregressive scheme (2) 
must be known. This will hardly ever be the case with economic data. 
If the constant A is not known and has to be estimated, the validity of 
the results is frequently impaired. It is also doubtful whether the simple 
difference transformation (11) will always be sufficient. 

Another shortcoming of the method is the neglect of errors in the 

variables $X_{2t}, X_{3t}, \cdots X_{pt}$. These are errors of observations and similar

deviations. To assume that they are absent is not justified in empirical 

work with economic series. A method of dealing with errors of this 
kind was presented in section 6.5. 

Example 1. We reproduce in Table 1 a table recently given by Stone 3 

and dealing with the demand for home-grown potatoes in Great Britain. 

A demand function has been derived by the method of least squares, 

(a) for the original data, ( b) by the use of the first difference transformation 

( 11 ). 

TABLE 1

                        Income         Own Price       All Other      Residual           Multiple         Von Neumann
                        Elasticity     Elasticity      Prices         Trend              Determination    Ratio

(a) Original Data       0.23 ± 0.30    -0.60 ± 0.08    0.44 ± 0.19    -0.008 ± 0.009     0.74             1.45
(b) First Differences   0.32 ± 0.29    -0.57 ± 0.07    0.43 ± 0.26    -0.007 ± 0.013     0.81             2.02

It appears that the estimates of the regression constants are approximately
identical for both the multiple regression with the original data (1)
and the analysis which uses the first differences (12). The standard errors
(added after the sign ±) are also not much different. But the multiple
determination (square of the multiple correlation coefficient) is somewhat 
greater if first differences are used. Hence the accuracy of the fit has 
been improved by the difference transformation. The Von Neumann 
ratio (section 10.2) indicates a definite departure from independence for 
the residuals from the regression analysis with the original data. But 
with the first differences the ratio is very near its mean value. 

The validity of the results may be questioned because a single equation
has been considered and not a system of equations. Errors of observations
have not been taken into account.

3 R. Stone: “The analysis of market demand. An outline of methods and
results,” Review of the International Statistical Institute, vol. 16 (1948), pp. 1 ff.

Example 2. Let us again try to find a demand function for meat in the
United States. We use the 23 years 1919-41; N = 23. [...]
We want to use the method of least squares in order to
find the demand function of meat. This would, for instance, be justified
[...] the supply [...] the price of meat [...] income [...]

(13)    $X_{1t} = k_0 + k_2 X_{2t} + k_3 X_{3t}$

An application of the method of least squares gives the estimate:

(14)    $X_{1t} = 185.682 - 0.966878X_{2t} + 0.140828X_{3t}$

The coefficient of determination (square of the multiple correlation
coefficient) is [...]. The multiple correlation coefficient is 0.791 [...].

The standard error of the coefficient $k_2$ (-0.966878) is 0.184. The
corresponding t is [...]. This is significant at the 0.1 per cent level for
20 degrees of freedom. We conclude that in all likelihood the
coefficient is not zero in the population which corresponds to the
sample. [...] The standard error of the coefficient $k_3$ (0.140828) is 0.025.

We can compute the elasticities of the demand for meat with respect
to price and income for the means of all the variables. The estimate of
the price elasticity is -0.537 and of the income elasticity 0.42[...]. [...]
The limits for the price elasticity are -0.751 and [...]; the limits for
the income elasticity are 0.575 and 0.264.

We have fitted the relationship:

(15)    $X_{1t} = 185.682 - 0.966878X_{2t} + 0.140828X_{3t}$

[...] series of meat consumption [...]





This is significantly different from zero at the 1 per cent level of significance
(section 10.1.2). Another test which might be used was given in section
10.1.4.

Hence we may take this figure as an estimate of A [formula (3)]. We
want to fit the following relationship [formula (4)]:

(16)    $(x_{1t} - 0.5252x_{1t-1}) = k_2(x_{2t} - 0.5252x_{2t-1}) + k_3(x_{3t} - 0.5252x_{3t-1})$

A straightforward application of the classical method of least squares
gives the following estimate:

(17)    $(x_{1t} - 0.5252x_{1t-1}) = -0.847555(x_{2t} - 0.5252x_{2t-1}) + 0.136182(x_{3t} - 0.5252x_{3t-1})$

The standard errors of the regression coefficients are 0.202731 and
0.031623. The corresponding values of t are 4.181 and 4.306, both significant
at the 1 per cent level of significance. The multiple correlation
coefficient is 0.725. This is also significant at the 1 per cent level of
significance. Hence we may state with some confidence that the autoregressive
transformation reveals some relationship which presumably
exists in the population to which our sample corresponds.

The estimates of the elasticities are computed for the means of our 
variables during the period in question. The price elasticity is — 0.471; 
the income elasticity of the demand for meat is 0.406. The 95 per cent 
confidence or fiducial limits are as follows: for the price elasticity, 

— 0.707 and — 0.235; for the income elasticity, 0.209 and 0.603. 

The autocorrelation coefficient of the deviations from the relationship 
derived from (17) is now — 0.034. This is computed on the non-circular 
definition (section 10.1.2). But if we use the circular autocorrelation 
coefficient as an approximation, we have for 22 degrees of freedom a 
permissible value of — 0.381 at the 5 per cent level of significance. Hence 
our empirical autocorrelation coefficient is not significant. This justifies 
to a certain extent the autoregressive transformation (16). 

We use the first difference transformation suggested by Orcutt in order 
to estimate a demand function for meat. 

Denote by $\Delta X_1$ the first difference of the quantity of meat consumed,
by $\Delta X_2$ the first difference of the price of meat, and by $\Delta X_3$ the first
difference of disposable income. Then the estimated relationship (12) is:

(18)    $\Delta X_1 = -0.797449(\Delta X_2) + 0.151721(\Delta X_3)$

The multiple correlation coefficient is 0.806. This is highly significant 
for 19 degrees of freedom. 

If we compute standard errors and divide the regression coefficients
by them, we obtain the following values for t: 17.356 and 16.744. They
are both significant. Hence it is very unlikely that the regression coefficients
are zero in the population which corresponds to our sample.

We compute again the elasticities at the means of all the variables:
The price elasticity of the demand for meat is -0.443, and the income
elasticity is 0.452.

With the help of the standard errors we can also compute the 95 per 
cent confidence or fiducial limits of the elasticities. They are as follows: 
for the price elasticity of demand, — 0.496 and — 0.390; and for the 
income elasticity, 0.620 and 0.284. 

The autocorrelation coefficient of the residuals from equation (18) is 
now — 0.318. This is not significant at the 5 per cent level of significance. 
We may have an autocorrelation coefficient as large as — 0.390 for 21 
degrees of freedom. This result justifies to a certain extent the application 
of the first difference transformation (18). 




Appendix 



A.1 Elements of Matrices and Determinants

We will here present some mathematical ideas which are necessary for 

the modern treatment of statistical problems. No proofs will be given. 

There is no need for completeness; only the fundamental ideas and 

theorems will be presented which are indispensable for the statistical 
analysis. 


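A.1.1 Matrices

We will here deal only with square matrices. 1 A square matrix of
order n is an array of n rows and columns:

(1)    $\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}$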


Example 1. The matrix:

(2)    $\begin{bmatrix} 4 & -1 \\ 2 & 3 \end{bmatrix}$

is a square matrix of order 2.

A square matrix may also be abbreviated as follows:

(3)    $A = [A_{ij}]$    $(i, j = 1, 2, \cdots n)$

$A_{ij}$ is an element of the matrix.

The first subscript i indicates the row, the second subscript j the column.

1 A. C. Aitken: Determinants and Matrices (5th ed., Edinburgh, 1948).
R. A. Frazer, W. J. Duncan, and A. R. Collar: Elementary Matrices (Cambridge [...]).
R. G. D. Allen: Mathematical Analysis for Economists (New York [...]), pp. [...].
C. C. MacDuffee: The Theory of Matrices (Ergebnisse der
Mathematik und ihrer Grenzgebiete, vol. 2), reprint (New York, 1946).
P. S. Dwyer: Linear Computations (New York, 1951), pp. 172 ff.



Example 2. In the example above (2) we have evidently:

(4)    $A_{11} = 4$, $A_{12} = -1$, $A_{21} = 2$, $A_{22} = 3$

A transposed matrix 2 is a matrix which has rows and columns exchanged.
It is designated by $A'$. If $A' = B$, we have:

(5)    $B_{ij} = A_{ji}$    $(i, j = 1, 2, \cdots n)$

Example 3. The transpose of the matrix in example 1, formula (2) is:

(6)    $\begin{bmatrix} 4 & 2 \\ -1 & 3 \end{bmatrix}$



A square matrix is a diagonal matrix, if all elements except the ones on
the principal diagonal ($A_{11}, A_{22}, \cdots, A_{nn}$) are zero.

Example 4. The matrix:

(7)    $\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

is a diagonal matrix of order 3.

The unit matrix 3 is a diagonal matrix in which the elements in the
principal diagonal are 1.

Example 5. The matrix:

(8)    $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

is a unit matrix of order 2.

A matrix is symmetrical 4 if we have $A_{ij} = A_{ji}$ $(i, j = 1, 2, \cdots n)$.

Example 6. The matrix:



is a symmetrical matrix of order 3. 


2 R. A. Frazer, W. J. Duncan, A. R. Collar: op. cit., p. 3.

3 Op. cit ., p. 3. 

4 Ibid ., p. 4. 





Two matrices are equal if all their elements are equal. Let $A = [A_{ij}]$
and $B = [B_{ij}]$ $(i, j = 1, 2, \cdots n)$ be two matrices of order n. The equality:

(10)    $A = B$

implies:

(11)    $A_{ij} = B_{ij}$    $(i, j = 1, 2, \cdots n)$

Addition 5 of matrices is performed by adding the corresponding elements
of each row and column:

(12)    $C = A + B$

means that the square matrix C of order n has the elements:

(13)    $C_{ij} = A_{ij} + B_{ij}$    $(i, j = 1, 2, \cdots n)$


Example 7. 



A matrix is subtracted from another by subtracting the elements of
the first matrix from the second for all rows and columns:

(15)    $D = A - B$

means:

(16)    $D_{ij} = A_{ij} - B_{ij}$    $(i, j = 1, 2, \cdots n)$

Example 8. 
(17) 




A matrix is multiplied by a scalar by multiplying each element by the scalar.
Let c be a scalar. Then the matrix cA has the elements $cA_{ij}$ $(i, j = 1, 2, \cdots n)$.
Example 9. 



5 Ibid., pp. 10 ff.





Two matrices are multiplied 6 in the following fashion:

(19)    $E = A \cdot B$

means:

(20)    $E_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj}$    $(i, j = 1, 2, \cdots n)$

The element in the ith row and the jth column of E is the sum of the products of
the ith row of A times the jth column of B. Note that the order of multiplication
is important.

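Example 10.

(21)    $\begin{bmatrix} 2 & -1 \\ 3 & 5 \end{bmatrix} \cdot \begin{bmatrix} 1 & -2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} (2)(1) + (-1)(3) & (2)(-2) + (-1)(-4) \\ (3)(1) + (5)(3) & (3)(-2) + (5)(-4) \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 18 & -26 \end{bmatrix}$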
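The multiplication rule (20) translates directly into a short routine. The sketch below applies it to the two matrices of Example 10 and also forms the product in the reverse order, to illustrate the remark that the order of multiplication matters. The function name `matmul` is a hypothetical helper.

```python
def matmul(a, b):
    """Product of two square matrices of the same order, following formula (20)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, -1],
     [3, 5]]
B = [[1, -2],
     [3, -4]]

E = matmul(A, B)          # A times B, as in Example 10
E_reversed = matmul(B, A) # B times A, a different matrix

print(E)
print(E_reversed)
```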


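The square 7 of a matrix is the matrix multiplied by itself:

(22)    $A^2 = A \cdot A$

Example 11.

(23)    $A = \begin{bmatrix} 1 & 0 \\ -2 & 3 \end{bmatrix}$

(24)    $A^2 = \begin{bmatrix} 1 & 0 \\ -2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 \\ -2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -8 & 9 \end{bmatrix}$

Higher powers 8 of a matrix are defined similarly:

(25)    $A^3 = A \cdot A^2$

We have, e.g.:

(26)    $A^4 = A^2 \cdot A^2$

Example 12. Let A be as in the previous example, formula (23). We
have:

(27)    $A^4 = A^2 \cdot A^2 = \begin{bmatrix} 1 & 0 \\ -8 & 9 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 \\ -8 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -80 & 81 \end{bmatrix}$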


6 Ibid., pp. 6 ff. 

7 Ibid. , pp. 10 ff. 

8 Ibid., pp. 10 ff. 





A.1.2 LINEAR EQUATIONS; DETERMINANTS 9

A system of linear equations in n unknowns, $x_1, x_2, \cdots x_n$, can be
written in the following way:

(1)    $A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n = b_1$
       $A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n = b_2$
       $\cdots$
       $A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n = b_n$

Let $A = [A_{ij}]$ be the matrix of the coefficients. These coefficients are
constant, as are the values $b_1, b_2, \cdots b_n$.

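A vector 10 is a column matrix, e.g.:

(2)    $x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$

(3)    $b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$

We can then write the above system of linear equations (1) in matrix
notation:

(4)    $A \cdot x = b$

Example 1.

(5)    $\begin{bmatrix} 2 & 3 \\ 5 & -1 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 11 \\ 2 \end{bmatrix}$

(5) is the same as the system:

(6)    $2x_1 + 3x_2 = 11$
       $5x_1 - x_2 = 2$

9 Ibid., pp. 16 ff.

10 Ibid., pp. 2 ff.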



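The determinant 11 of the matrix $A = [A_{ij}]$ $(i, j = 1, 2, \cdots n)$ is denoted
by:

(7)    $|A| = \begin{vmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{vmatrix}$

Example 2. The determinant of the above system of linear equations
(5) is:

(8)    $|A| = \begin{vmatrix} 2 & 3 \\ 5 & -1 \end{vmatrix}$

A determinant of the order 2 has the value: $|A| = A_{11}A_{22} - A_{12}A_{21}$.
Hence the value of our determinant is:

(9)    $|A| = (2)(-1) - (3)(5) = -17$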


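The minor 12 corresponding to the element $A_{ij}$ in a determinant is the
determinant which results from omitting the ith row and the jth column.

Example 3.

(10)    $|A| = \begin{vmatrix} 1 & 0 & -1 \\ 0 & 1 & 3 \\ -1 & 3 & 12 \end{vmatrix}$

The minor of $A_{11}$ in the determinant (10) is computed by omitting the
first row and the first column in A:

(11)    $\begin{vmatrix} 1 & 3 \\ 3 & 12 \end{vmatrix}$

The minor of $A_{32}$ is computed by omitting the third row and the second
column in (10):

(12)    $\begin{vmatrix} 1 & -1 \\ 0 & 3 \end{vmatrix}$

The principal minors are:

(13)    $A_{11}$,  $\begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix}$,  $\begin{vmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{vmatrix}$,  $\cdots$,  $|A|$

If all principal minors are positive, the matrix A is said to be positive
definite.

Example 4. For the matrix of the last example [formula (10)] we
have the following principal minors:

(14)    $1$,  $\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1$,  $\begin{vmatrix} 1 & 0 & -1 \\ 0 & 1 & 3 \\ -1 & 3 & 12 \end{vmatrix} = 2$

Since all principal minors are positive, the matrix A is positive definite.

11 Ibid., pp. 16 ff.

12 Ibid., pp. 16 ff.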
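The inverse of a matrix A is denoted by $A^{-1}$. We have:

(15)    $A^{-1} \cdot A = A \cdot A^{-1} = I$

where I is the unit matrix. The inverse of the matrix $[A_{ij}]$ is also denoted
by superscripts as the matrix $[A^{ij}]$.

Example 5. Denote by $C = [C_{ij}]$ the matrix inverse to A. Then we
have from the above definition in our case for the matrix (8):

(16)    $\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} \cdot \begin{bmatrix} 2 & 3 \\ 5 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

Multiplying the two first matrices in (16), we have:

(17)    $\begin{bmatrix} 2C_{11} + 5C_{12} & 3C_{11} - C_{12} \\ 2C_{21} + 5C_{22} & 3C_{21} - C_{22} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

13 Ibid., pp. 30 ff.

14 Ibid., pp. 22 ff.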




This gives the two systems of linear equations:

(18)    $2C_{11} + 5C_{12} = 1$
        $3C_{11} - C_{12} = 0$

and

(19)    $2C_{21} + 5C_{22} = 0$
        $3C_{21} - C_{22} = 1$

The solutions are $C_{11} = 1/17$, $C_{12} = 3/17$, $C_{21} = 5/17$, $C_{22} = -2/17$. The
reciprocal or inverse matrix is:

(20)    $C = A^{-1} = \begin{bmatrix} 1/17 & 3/17 \\ 5/17 & -2/17 \end{bmatrix}$

We can easily verify by matrix multiplication:

(21)    $\begin{bmatrix} 1/17 & 3/17 \\ 5/17 & -2/17 \end{bmatrix} \cdot \begin{bmatrix} 2 & 3 \\ 5 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

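To solve the system of linear equations 15 in matrix form:

(22)    $A \cdot x = b$

we multiply both sides by $A^{-1}$. We obtain for a solution:

(23)    $x = A^{-1} \cdot b$

Example 6. We have in our special case, formula (5), with the use of
formula (20):

(24)    $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1/17 & 3/17 \\ 5/17 & -2/17 \end{bmatrix} \cdot \begin{bmatrix} 11 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$

It is easily verified that $x_1 = 1$, $x_2 = 3$ satisfy the original system of
linear equations (5).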
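The inversion and the solution of Examples 5 and 6 can be reproduced with exact fractions. This is a minimal sketch for matrices of order 2 only, using the standard closed form for the inverse of a 2 x 2 matrix rather than the two systems of equations solved in the text; `inverse_2x2` is a hypothetical helper name.

```python
from fractions import Fraction

def inverse_2x2(a):
    """Inverse of a 2x2 matrix by the closed form (adjugate over determinant)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[Fraction(a[1][1], det), Fraction(-a[0][1], det)],
            [Fraction(-a[1][0], det), Fraction(a[0][0], det)]]

A = [[2, 3],
     [5, -1]]
b = [11, 2]

C = inverse_2x2(A)                                        # entries 1/17, 3/17, 5/17, -2/17
x = [sum(C[i][j] * b[j] for j in range(2)) for i in range(2)]   # x = C times b

print(C)
print(x)
```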

A homogeneous system of linear equations has zeros on the right-hand 
side: 


(25)    $B \cdot x = 0$

It can have solutions which are non-trivial (i.e., not all zero) only if
the determinant of the system is zero:

(26)    $|B| = 0$


15 Ibid., pp. 27 ff. 





There are an infinite number of solutions, but these can be normalized
by an additional condition, e.g., $x_1^2 + x_2^2 + \cdots + x_n^2 = 1$; or $x_1 = 1$.

Example 7. Let the homogeneous system of equations be:

(27)    $x_1 + 2x_2 = 0$
        $2x_1 + 4x_2 = 0$

The determinant of system (27) is:

(28)    $|B| = \begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} = 0$

The values $x_1 = -2/\sqrt{5}$ and $x_2 = 1/\sqrt{5}$ are such that they satisfy
the linear equations (27) and also the normalizing condition $x_1^2 + x_2^2 = 1$.

We consider the following system of linear homogeneous
equations:

(29)    $B \cdot x = \lambda C \cdot x$

where $\lambda$ is a constant (scalar). We can transform it in this fashion:

(30)    $(B - \lambda C) \cdot x = 0$

This system has non-trivial solutions only if the following determinantal
equation 16 holds:

(31)    $|B - \lambda C| = 0$

The roots of $\lambda$, say $\lambda_1, \lambda_2, \cdots \lambda_n$, must satisfy this equation (31). For
each of them we have a set of solutions $x_1 = \{x_{11}, x_{12}, \cdots x_{1n}\}$,
$x_2 = \{x_{21}, x_{22}, \cdots x_{2n}\}$, $\cdots$ $x_n = \{x_{n1}, x_{n2}, \cdots x_{nn}\}$. They may of course again be normalized in
the manner indicated above.

Example 8. We have the system of equations:

(32)    $7x_1 + 13x_2 = \lambda(4x_1 + 7x_2)$
        $4x_1 + 6x_2 = \lambda(3x_1 + 4x_2)$

This system (32) can be transformed into form (30):

(33)    $(7 - 4\lambda)x_1 + (13 - 7\lambda)x_2 = 0$
        $(4 - 3\lambda)x_1 + (6 - 4\lambda)x_2 = 0$

16 Ibid., pp. 61 ff.





The determinantal equation (31) is:

(34)    $\begin{vmatrix} 7 - 4\lambda & 13 - 7\lambda \\ 4 - 3\lambda & 6 - 4\lambda \end{vmatrix} = 0$

This can be developed into the following equation:

(35)    $-10 + 15\lambda - 5\lambda^2 = 0$

The solutions of the determinantal equation (34) are the values $\lambda_1 = 1$
and $\lambda_2 = 2$. It is easily seen that the determinant (34) becomes zero if
these values are substituted for $\lambda$.

If we substitute the first root $\lambda_1 = 1$ into the system of equations (33),
we have for the first set of solutions $x_1$:

(36)    $3x_{11} + 6x_{12} = 0$
        $x_{11} + 2x_{12} = 0$

The values $x_{11} = -2/\sqrt{5}$ and $x_{12} = 1/\sqrt{5}$ are such that they are associated
with the first root $\lambda_1 = 1$ and also satisfy the normalization condition.

Substituting now the second root $\lambda_2 = 2$ into the system of equations
(33), we get for the second set of solutions $x_2$ the two equations:

(37)    $-x_{21} - x_{22} = 0$
        $-2x_{21} - 2x_{22} = 0$

The set of values $x_{21} = -1/\sqrt{2}$ and $x_{22} = 1/\sqrt{2}$ are associated with
the second root $\lambda_2 = 2$. They also satisfy the normalization condition.
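The computations of Example 8 can be retraced numerically. The sketch below is restricted to matrices of order 2 and assumes real roots; it expands the determinantal equation into a quadratic, solves it, and derives a normalized solution from the first equation of the system for each root. The function names are hypothetical. The sign of each normalized solution is arbitrary, so a vector may come out with both signs reversed relative to the text.

```python
import math

def determinantal_quadratic(B, C):
    """Coefficients (c0, c1, c2) of |B - lam*C| = c0 + c1*lam + c2*lam^2 for 2x2 matrices."""
    c0 = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    c1 = -(B[0][0] * C[1][1] + B[1][1] * C[0][0]
           - B[0][1] * C[1][0] - B[1][0] * C[0][1])
    c2 = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return c0, c1, c2

def real_roots(c0, c1, c2):
    """Roots of the quadratic, assuming a non-negative discriminant and c2 != 0."""
    disc = math.sqrt(c1 * c1 - 4 * c2 * c0)
    return sorted([(-c1 - disc) / (2 * c2), (-c1 + disc) / (2 * c2)])

def normalized_solution(B, C, lam):
    """Solution of (B - lam*C)x = 0 with x1^2 + x2^2 = 1, taken from the first equation."""
    a = B[0][0] - lam * C[0][0]
    b = B[0][1] - lam * C[0][1]
    norm = math.hypot(a, b)          # assumes the first equation is not identically zero
    return (-b / norm, a / norm)

B = [[7, 13], [4, 6]]
C = [[4, 7], [3, 4]]

coefficients = determinantal_quadratic(B, C)      # (-10, 15, -5), as in (35)
roots = real_roots(*coefficients)                 # [1.0, 2.0]
vectors = [normalized_solution(B, C, lam) for lam in roots]
print(coefficients, roots, vectors)
```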
Let a system of linear homogeneous equations be given in the form:

(38)    $(A - \lambda B) \cdot x = 0$

For purposes of numerical computations it is sometimes desirable to
apply a transformation which makes B into the unit matrix.

Let $B^{-1} = C$ and multiply system (38) by C. The matrix C is the
inverse of B. Then we have:

(39)    $(C \cdot A - I\lambda) \cdot x = 0$

I is the unit matrix. This system (39) is more amenable to numerical
computations.

If we have B = I (the unit matrix) in (38), then the roots of the determinantal
equation corresponding to (39):

(40)    $|C \cdot A - I\lambda| = 0$

are called latent roots 17 of the matrix $C \cdot A$.

17 Ibid., pp. 64 ff.





The solutions $x_1, x_2, \cdots x_n$ are called characteristic vectors.

The matrix $C \cdot A$ is positive definite if all the characteristic roots
$\lambda_1, \lambda_2, \cdots \lambda_n$ are real and greater than zero. This will especially be the case
if all the elements of $C \cdot A$ are real and the principal minors of $C \cdot A$
and the determinant of the matrix $C \cdot A$ itself are positive.

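Example 9. Let the system of equations (38) be:

(41)    $(8 - 5\lambda)x_1 + (7 - 4\lambda)x_2 = 0$
        $(6 - 5\lambda)x_1 + (4 - 3\lambda)x_2 = 0$

We have the following matrices:

(42)    $A = \begin{bmatrix} 8 & 7 \\ 6 & 4 \end{bmatrix}$

(43)    $B = \begin{bmatrix} 5 & 4 \\ 5 & 3 \end{bmatrix}$

The inverse of B is C:

(44)    $C = B^{-1} = \begin{bmatrix} -3/5 & 4/5 \\ 1 & -1 \end{bmatrix}$

We have:

(45)    $C \cdot A = \begin{bmatrix} -3/5 & 4/5 \\ 1 & -1 \end{bmatrix} \cdot \begin{bmatrix} 8 & 7 \\ 6 & 4 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 2 & 3 \end{bmatrix}$

Hence system (41) is equivalent to:

(46)    $-\lambda x_1 - x_2 = 0$
        $2x_1 + (3 - \lambda)x_2 = 0$

By making the determinant of this system (46) equal to zero we have:

(47)    $\begin{vmatrix} -\lambda & -1 \\ 2 & (3 - \lambda) \end{vmatrix} = 0$

The solutions are $\lambda_1 = 1$, $\lambda_2 = 2$. These are the latent roots of
matrix $C \cdot A$. The normalized solutions are $x_{11} = 1/\sqrt{2}$, $x_{12} = -1/\sqrt{2}$;
$x_{21} = 1/\sqrt{5}$, $x_{22} = -2/\sqrt{5}$. These are the characteristic vectors. It
can easily be verified that they are also the solutions of the original
system (41).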
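The reduction used in Example 9 can be sketched as follows: invert B, form the product of the inverse with A, and find the latent roots from the characteristic quadratic of the product. The sketch handles matrices of order 2 only and assumes real roots; the helper names are hypothetical.

```python
from fractions import Fraction
import math

def inverse_2x2(m):
    """Exact inverse of a 2x2 integer matrix (adjugate over determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is singular; the transformation cannot be applied")
    return [[Fraction(m[1][1], det), Fraction(-m[0][1], det)],
            [Fraction(-m[1][0], det), Fraction(m[0][0], det)]]

def matmul_2x2(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[8, 7], [6, 4]]
B = [[5, 4], [5, 3]]

C = inverse_2x2(B)              # entries -3/5, 4/5, 1, -1
CA = matmul_2x2(C, A)           # entries 0, -1, 2, 3

# Latent roots: lam^2 - (trace)*lam + (determinant) = 0 for the 2x2 matrix CA.
trace = CA[0][0] + CA[1][1]
det = CA[0][0] * CA[1][1] - CA[0][1] * CA[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
latent_roots = [(float(trace) - disc) / 2, (float(trace) + disc) / 2]

print(C, CA, latent_roots)
```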





A.2 Methods of Numerical Computation 

We present in the Appendix some methods of numerical computation 
which are necessary for the application of the methods presented in this 
book. No attempt has been made to cover the field completely. We 
have tried to select methods which are applicable to a great variety of 
problems. 

A.2.1 THE CROUT METHOD 1 FOR SOLVING SYSTEMS OF LINEAR 
EQUATIONS 

Suppose that we have a system of linear equations in n unknowns: 

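(1)    $\sum_{j=1}^{n} A_{ij}x_j = b_i$    $(i = 1, 2, \cdots n)$

We arrange the quantities in this system as shown in Table 1.

TABLE 1

$A_{11}$    $A_{12}$    $\cdots$    $A_{1n}$    $b_1$    $s_1$
$A_{21}$    $A_{22}$    $\cdots$    $A_{2n}$    $b_2$    $s_2$
$\cdots$
$A_{n1}$    $A_{n2}$    $\cdots$    $A_{nn}$    $b_n$    $s_n$

The quantities $s_i = \sum_{j=1}^{n} A_{ij} + b_i$ are added for checks. They are simply
the sums of the (n + 1) first elements in each row.

Now we derive the new set of quantities shown in Table 2. This table
should be written below Table 1.

TABLE 2

$B_{11}$    $B_{12}$    $\cdots$    $B_{1n}$    $c_1$    $t_1$
$B_{21}$    $B_{22}$    $\cdots$    $B_{2n}$    $c_2$    $t_2$
$\cdots$
$B_{n1}$    $B_{n2}$    $\cdots$    $B_{nn}$    $c_n$    $t_n$

These are computed in the following way: For the first column we
have:

(2)    $B_{i1} = A_{i1}$    $(i = 1, 2, \cdots n)$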

1 P. D. Crout: “A short method for evaluating determinants and solving 
systems of linear equations with real or complex coefficients,” Transactions of 
the American Institute of Electrical Engineers , vol. 60 (1941). N. Bruner and 
D. H. Leavens: “Notes on the Doolittle solution,” Econometrica , vol. 15 
(1947), pp. 43 ff. P. S. Dwyer: Linear Computations (New York, 1951), pp.
103 ff. D. B. Duncan and J. F. Kenney: Normal Equations and Related Topics
(Ann Arbor, Mich., 1946). 






We simply copy the elements of the first column of Table 1. The
elements in the first row are:

(3)    $B_{1j} = \dfrac{A_{1j}}{B_{11}}$

(4)    $c_1 = \dfrac{b_1}{B_{11}}$

(5)    $t_1 = \dfrac{s_1}{B_{11}} = \sum_{j=2}^{n} B_{1j} + c_1 + 1$

The elements in the first row of Table 2 right of the diagonal $B_{11}$ are
the corresponding elements in Table 1 divided by the diagonal element
$B_{11}$.

The last relationship (5) gives a check. The sum of the elements of
the first row in Table 2 right of the diagonal element $B_{11}$ is $t_1 - 1$.

The diagonal element in the second column $B_{22}$ is computed as follows:

(6)    $B_{22} = A_{22} - B_{21}B_{12}$

The elements in the second column below the diagonal are computed
by the formula:

(7)    $B_{i2} = A_{i2} - B_{i1}B_{12}$

The elements in the second row right of the diagonal are computed by:

(8)    $B_{2j} = \dfrac{A_{2j} - B_{21}B_{1j}}{B_{22}}$

(9)    $c_2 = \dfrac{b_2 - B_{21}c_1}{B_{22}}$

(10)    $t_2 = \dfrac{s_2 - B_{21}t_1}{B_{22}} = \sum_{j=3}^{n} B_{2j} + c_2 + 1$

Again the last relation (10) gives a check.

The diagonal element in the third row is computed as follows:

(11)    $B_{33} = A_{33} - B_{31}B_{13} - B_{32}B_{23}$

The elements in the third column below the diagonal are:

(12)    $B_{i3} = A_{i3} - B_{i1}B_{13} - B_{i2}B_{23}$

The elements in the third row right of the diagonal are:

(13)    $B_{3j} = \dfrac{A_{3j} - B_{31}B_{1j} - B_{32}B_{2j}}{B_{33}}$

(14)    $c_3 = \dfrac{b_3 - c_1B_{31} - c_2B_{32}}{B_{33}}$

(15)    $t_3 = \dfrac{s_3 - t_1B_{31} - t_2B_{32}}{B_{33}} = \sum_{j=4}^{n} B_{3j} + c_3 + 1$

Again this last relation (15) gives a check.

In general, the diagonal element $B_{ii}$ is computed as follows:

(16)    $B_{ii} = A_{ii} - \sum_{s=1}^{i-1} B_{is}B_{si}$

The elements in the jth column below the diagonal are:

(17)    $B_{ij} = A_{ij} - \sum_{s=1}^{j-1} B_{is}B_{sj}$

The elements in the ith row right of the diagonal are:

(18)    $B_{ij} = \dfrac{A_{ij} - \sum_{s=1}^{i-1} B_{is}B_{sj}}{B_{ii}}$

(19)    $c_i = \dfrac{b_i - \sum_{s=1}^{i-1} c_sB_{is}}{B_{ii}}$

(20)    $t_i = \dfrac{s_i - \sum_{s=1}^{i-1} t_sB_{is}}{B_{ii}} = \sum_{s=i+1}^{n} B_{is} + c_i + 1$

The last relation gives a check.

The solutions appear in Table 3. This table should be written below
Table 2.

TABLE 3

$u_1$    $u_2$    $\cdots$    $u_n$
$v_1$    $v_2$    $\cdots$    $v_n$

We have $u_n = c_n$ and $v_n = t_n$.

(21)    $u_{n-1} = c_{n-1} - u_nB_{n-1,n}$

(22)    $v_{n-1} = t_{n-1} - v_nB_{n-1,n} = u_{n-1} + 1$

This last formula (22) provides a check for $u_{n-1}$.

(23)    $u_{n-2} = c_{n-2} - u_nB_{n-2,n} - u_{n-1}B_{n-2,n-1}$

(24)    $v_{n-2} = t_{n-2} - v_nB_{n-2,n} - v_{n-1}B_{n-2,n-1} = u_{n-2} + 1$

In general we have:

(25)    $u_i = c_i - \sum_{s=i+1}^{n} B_{is}u_s$

(26)    $v_i = t_i - \sum_{s=i+1}^{n} B_{is}v_s = u_i + 1$

The last relation (26) provides a check.

Example 1. We want to solve the system of linear equations:

(27)    $2x_1 + 3x_2 - x_3 = 5$
        $x_1 - 4x_2 + 2x_3 = -1$
        $-2x_2 + 3x_3 = 5$

Table 1 becomes:

TABLE 4

2     3    -1     5     9
1    -4     2    -1    -2
0    -2     3     5     6

Note that the elements in the last column ($s_i$) are the sums of the first
four elements in the given rows. We have 2 + 3 - 1 + 5 = 9 = $s_1$;
1 - 4 + 2 - 1 = -2 = $s_2$; 0 - 2 + 3 + 5 = 6 = $s_3$. This procedure
provides a check column.

From the array in Table 4 we derive Table 5, which corresponds to
Table 2. Table 5 should be written below Table 4.

TABLE 5

2.000000     1.500000    -0.5000000    2.5000000    4.5000000
1.000000    -5.500000    -0.4545454    0.6363636    1.1818181
0.000000    -2.000000     2.0909091    3.0000000    4.0000000

These elements are computed according to the formulae given above.

We have from (2): $B_{11} = 2$, $B_{21} = 1$, $B_{31} = 0$. These are simply the
elements in the first column of Table 4.




The elements in the first row, right of diagonal element 2, are computed
from formula (3): $B_{12} = 3/2 = 1.5$; $B_{13} = -1/2 = -0.5$. We have
from formula (4): $c_1 = 5/2 = 2.5$. Finally from formula (5): $t_1 = 9/2 =
4.5 = (1.5 - 0.5 + 2.5) + 1$. This provides a check for the elements in
the first row.

In order to compute the diagonal element in the second column we
use formula (6): $B_{22} = -4 - (1)(1.5) = -5.5$. The remaining element
in the second column is computed by formula (7): $B_{32} = -2 - (0)(1.5) =
-2$. The remaining elements in the second row follow from formulae
(8), (9), and (10): $B_{23} = [2 - (1)(-0.5)]/(-5.5) = -0.4545454$; $c_2 =
[-1 - (1)(2.5)]/(-5.5) = 0.6363636$; $t_2 = [-2 - (1)(4.5)]/(-5.5) =
1.1818181 = (-0.4545454 + 0.6363636) + 1$. The last formula gives
a check.

The only remaining element in the third column is the diagonal element,
which is computed by formula (11): $B_{33} = 3 - (0)(-0.5) - (-2)(-0.4545454) = 2.0909091$.
The other elements in row 3 are computed
from formulae (14) and (15). They are $c_3 = [5 - (0)(2.5) - (-2)(0.6363636)]/2.0909091 = 3$;
$t_3 = [6 - (0)(4.5) - (-2)(1.1818181)]/2.0909091 = 4 = 3 + 1$. Again this gives a check.

The solution is given in Table 6, which takes the place of Table 3. It
should be written below Table 5.

TABLE 6

1.000000    2.000000    3.000000
2.000000    3.000000    4.000000

We have $u_3 = 3$ and $v_3 = 4 = 3 + 1$ from the last two elements in the
last row of Table 5. From formulae (21) and (22) we have $u_2 = 0.6363636
- (-0.4545454)(3) = 2$; $v_2 = 1.1818181 - (-0.4545454)(4) = 3 = 2 + 1$.
This checks the result $u_2$.

The other solutions are computed from formulae (23) and (24): $u_1 =
2.5 - (-0.5)(3) - (1.5)(2) = 1$; $v_1 = 4.5 - (-0.5)(4) - (1.5)(3) = 2 =
1 + 1$. The last result checks $u_1$.

It follows that our solutions are $x_1 = 1$, $x_2 = 2$, $x_3 = 3$. These results
can easily be checked by substituting into the original system of equations.
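The whole procedure, including the check column, can be sketched as a short routine. The sketch follows the general formulae (16) to (20) for the derived table and (25) and (26) for the back solution. The name `crout_solve`, the floating-point arithmetic, and the check tolerance are assumptions of this illustration; no rearrangement of the equations is attempted when a diagonal element turns out to be zero.

```python
def crout_solve(A, b, tol=1e-9):
    """Solve A x = b by the tabular scheme described above, with check columns."""
    n = len(A)
    s = [sum(A[i]) + b[i] for i in range(n)]      # check column of Table 1
    B = [[0.0] * n for _ in range(n)]
    c = [0.0] * n
    t = [0.0] * n
    for i in range(n):
        # Column i, on and below the diagonal: formulae (16) and (17).
        for r in range(i, n):
            B[r][i] = A[r][i] - sum(B[r][k] * B[k][i] for k in range(i))
        if B[i][i] == 0:
            raise ZeroDivisionError("zero diagonal element; reorder the equations")
        # Row i, right of the diagonal: formula (18).
        for j in range(i + 1, n):
            B[i][j] = (A[i][j] - sum(B[i][k] * B[k][j] for k in range(i))) / B[i][i]
        c[i] = (b[i] - sum(c[k] * B[i][k] for k in range(i))) / B[i][i]   # (19)
        t[i] = (s[i] - sum(t[k] * B[i][k] for k in range(i))) / B[i][i]   # (20)
        if abs(t[i] - (sum(B[i][i + 1:]) + c[i] + 1)) > tol:
            raise ArithmeticError("row check failed in row %d" % (i + 1))
    u = [0.0] * n
    v = [0.0] * n
    for i in reversed(range(n)):
        u[i] = c[i] - sum(B[i][k] * u[k] for k in range(i + 1, n))        # (25)
        v[i] = t[i] - sum(B[i][k] * v[k] for k in range(i + 1, n))        # (26)
        if abs(v[i] - (u[i] + 1)) > tol:
            raise ArithmeticError("solution check failed for unknown %d" % (i + 1))
    return u, B

A = [[2, 3, -1],
     [1, -4, 2],
     [0, -2, 3]]
b = [5, -1, 5]
solution, table = crout_solve(A, b)
print([round(x, 6) for x in solution])
```

The second returned value holds the B entries of the derived table, so its diagonal can be compared with Table 5.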

Similar methods can also be used for the transformation of sets of
variables. Let us assume, for instance, that we have the system of
equations:

(28)    $\sum_{j=1}^{n} A_{ij}x_j = \sum_{j=1}^{n} b_{ij}y_j$

If we want to express the quantities $x_i$ in terms of the $y_j$, we proceed in
the same way as above. The only difference is that we have now n
columns $b_{i1}, b_{i2}, \cdots b_{in}$, instead of one column, $b_i$, in Table 1. The
check column and the solutions are computed as before.

Example 2. Our system of equations is:

(29)    $2x_1 + 3x_2 - x_3 = y_1 + 4y_2 - y_3$
        $x_1 - 4x_2 + 2x_3 = -y_1 + y_2 + 3y_3$
        $-2x_2 + 3x_3 = y_1 + y_2$

Table 1 becomes now:


TABLE 7

2.000000     3.000000    -1.000000     1.000000    4.000000    -1.000000    8.000000
1.000000    -4.000000     2.000000    -1.000000    1.000000     3.000000    2.000000
0.000000    -2.000000     3.000000     1.000000    1.000000     0.000000    3.000000

The last column is the sum of the first six items in each row. This 
provides the check column. 

Table 2 is now: 


TABLE 8

2.000000     1.500000    -0.5000000    0.500000    2.000000    -0.500000    4.000000
1.000000    -5.500000    -0.4545454    0.272727    0.181818    -0.636364    0.363636
0.000000    -2.000000     2.0909091    0.739130    0.652174    -0.608697    1.782608


The last column provides a check. It is equal to the sum of the elements 
in each row which are to the right of the diagonal, plus 1. 

Table 3, which gives the solution, is now: 


TABLE 9

-0.043477   0.608695   0.739130
 1.608696   0.478261   0.652174
 0.565217  -0.913044  -0.608697
 3.130436   1.173912   1.782608

The last row provides a check. It is 1 more than the sum of the items in each column which are above. The matrix in Table 9 is the transpose of the matrix of the solutions.

The required solution of our system of equations is: 


(30)
$x_1 = -0.043477 y_1 + 1.608696 y_2 + 0.565217 y_3$
$x_2 = 0.608695 y_1 + 0.478261 y_2 - 0.913044 y_3$
$x_3 = 0.739130 y_1 + 0.652174 y_2 - 0.608697 y_3$
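The reduction used in Example 2 can be sketched in code. The following is a minimal Python sketch under stated assumptions, not the procedure as printed: the function name `crout_solve` and its argument layout are chosen here for illustration, the check column is omitted, and no pivoting is attempted, so every diagonal element of the auxiliary matrix is assumed to be different from zero.

```python
def crout_solve(a, rhs):
    """Solve A X = RHS by a Crout-type reduction without pivoting.

    a   : square matrix as a list of rows
    rhs : list of rows holding one or more right-hand columns
    Returns (aux, x): aux is the auxiliary matrix, x the solution columns.
    """
    n, m = len(a), len(rhs[0])
    given = [list(a[i]) + list(rhs[i]) for i in range(n)]
    aux = [[0.0] * (n + m) for _ in range(n)]
    for i in range(n):
        # Column i, on and below the diagonal.
        for r in range(i, n):
            aux[r][i] = given[r][i] - sum(aux[r][k] * aux[k][i] for k in range(i))
        # Row i, to the right of the diagonal (divided by the diagonal element).
        for c in range(i + 1, n + m):
            s = sum(aux[i][k] * aux[k][c] for k in range(i))
            aux[i][c] = (given[i][c] - s) / aux[i][i]
    # Back substitution, starting from the last row.
    x = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in range(m):
            s = sum(aux[i][k] * x[k][j] for k in range(i + 1, n))
            x[i][j] = aux[i][n + j] - s
    return aux, x


# Data of Example 2: left-hand coefficients and right-hand coefficients.
A = [[2, 3, -1], [1, -4, 2], [0, -2, 3]]
Y = [[1, 4, -1], [-1, 1, 3], [1, 1, 0]]
aux, X = crout_solve(A, Y)
for row in X:
    print(" ".join(f"{v:10.6f}" for v in row))
```

Row i of `X` holds the coefficients of $y_1$, $y_2$, $y_3$ in the expression for $x_i$, so `X` is the matrix of the solutions and its transpose corresponds to Table 9. The values agree with the printed ones to about five decimals; small differences in the last place come from rounding in the hand computation.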





A.2.2 COMPUTATION OF DETERMINANTS 

The Crout method may also be used for computing determinants. 
The determinant of the matrix: 


TABLE 1

$\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}$

is simply the product of the diagonal elements in the array given in Table 2 of the last section (A.2.1):

(1) $D_n = B_{11} B_{22} \cdots B_{nn}$

Example 1. We consider the matrix from the last section (Table 4 
of A.2.1): 

TABLE 2

$\begin{bmatrix} 2 & 3 & -1 \\ 1 & -4 & 2 \\ 0 & -2 & 3 \end{bmatrix}$

We have the diagonal elements from the matrix in Table 5 of the last section (A.2.1):

(2) $D_3 = B_{11} B_{22} B_{33} = (2)(-5.5)(2.0909091) = -23$
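The determinant rule can be tried out with a short Python sketch. It is only an illustration under stated assumptions: the helper names `crout_diagonal` and `determinant` are hypothetical, and the reduction is carried out without pivoting, so it fails if a diagonal element of the auxiliary matrix is zero.

```python
def crout_diagonal(a):
    """Return the diagonal elements B_11, ..., B_nn of the auxiliary matrix."""
    n = len(a)
    aux = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Column i, on and below the diagonal.
        for r in range(i, n):
            aux[r][i] = a[r][i] - sum(aux[r][k] * aux[k][i] for k in range(i))
        # Row i, to the right of the diagonal.
        for c in range(i + 1, n):
            s = sum(aux[i][k] * aux[k][c] for k in range(i))
            aux[i][c] = (a[i][c] - s) / aux[i][i]
    return [aux[i][i] for i in range(n)]


def determinant(a):
    """Product of the diagonal elements of the auxiliary matrix."""
    d = 1.0
    for b in crout_diagonal(a):
        d *= b
    return d


A = [[2, 3, -1], [1, -4, 2], [0, -2, 3]]
diag = crout_diagonal(A)   # the three diagonal elements used in formula (2)
det = determinant(A)
print(diag, det)
```

For the matrix of Table 2 the diagonal elements are 2, -5.5 and about 2.0909091, and their product is -23 up to rounding.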


A.2.3 INVERSION OF MATRICES 2 

The Crout method may also be used to compute the inverse of a 
matrix. Let the matrix be: 

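TABLE 1

$\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}$

The inverse matrix is:

TABLE 2

$\begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix}$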

2 R. A. Fisher: Statistical Methods for Research Workers (10th ed., Edinburgh, 1946), pp. 158 ff., section 29. P. S. Dwyer: op. cit., pp. 185 ff.







This matrix is computed from $n$ systems of linear equations:

(1)
$c_{11} A_{11} + c_{21} A_{12} + \cdots + c_{n1} A_{1n} = 1$
$c_{11} A_{21} + c_{21} A_{22} + \cdots + c_{n1} A_{2n} = 0$
$\cdots$
$c_{11} A_{n1} + c_{21} A_{n2} + \cdots + c_{n1} A_{nn} = 0$

This gives the first column of the inverse matrix: $c_{11}, c_{21}, \ldots, c_{n1}$.

The next column is computed from a system like (1), except that there is now a 1 on the right-hand side of the second equation and zero everywhere else:

(2)
$c_{12} A_{11} + c_{22} A_{12} + \cdots + c_{n2} A_{1n} = 0$
$c_{12} A_{21} + c_{22} A_{22} + \cdots + c_{n2} A_{2n} = 1$
$\cdots$
$c_{12} A_{n1} + c_{22} A_{n2} + \cdots + c_{n2} A_{nn} = 0$

We compute the successive columns of the inverse matrix in the same way. For the elements $c_{ij}$ of the $j$th column we have a system where there is 1 on the right-hand side of the $j$th equation but zero on the right-hand side of all other equations.

In order to use the Crout method for a quick computation of the 

inverse matrix we arrange our elements as in Table 3, analogous to 
Table 1 in section A.2.1. 


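TABLE 3

$\begin{array}{cccc|cccc|c}
A_{11} & A_{12} & \cdots & A_{1n} & 1 & 0 & \cdots & 0 & S_1 \\
A_{21} & A_{22} & \cdots & A_{2n} & 0 & 1 & \cdots & 0 & S_2 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\
A_{n1} & A_{n2} & \cdots & A_{nn} & 0 & 0 & \cdots & 1 & S_n
\end{array}$

It should be noted that in this array the $(n + 1)$ element of the first row is 1, the $(n + 2)$ element of the second row is 1, ..., the $2n$ element of the $n$th row is 1. All the other elements in these columns are zeros. The sums $S_i$ are introduced for the purpose of checking. They are again the sums of all the first $2n$ elements in the $i$th row.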

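Another matrix similar to the matrix in Table 2 in section A.2.1 is introduced:

TABLE 4

$\begin{array}{cccc|cccc}
B_{11} & B_{12} & \cdots & B_{1n} & C_{11} & C_{12} & \cdots & C_{1n} \\
B_{21} & B_{22} & \cdots & B_{2n} & C_{21} & C_{22} & \cdots & C_{2n} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
B_{n1} & B_{n2} & \cdots & B_{nn} & C_{n1} & C_{n2} & \cdots & C_{nn}
\end{array}$

(followed by a check column)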









The elements $B_{ij}$ are computed in the same way as described in section A.2.1. The $C_{ij}$ are computed in the same way as the elements $c_i$ [formula (19) of section A.2.1]:

(3) $C_{ij} = \dfrac{\delta_{ij} - \sum_{k=1}^{i-1} B_{ik} C_{kj}}{B_{ii}}$

$\delta_{ij}$ is here the appropriate element of the unit matrix to the right of the matrix $[A_{ij}]$ in Table 3.

The elements of the check column are computed like the check column in section A.2.1 and provide a check. The solutions are given in Table 5.


TABLE 5

$\begin{array}{cccc}
u_{11} & u_{12} & \cdots & u_{1n} \\
u_{21} & u_{22} & \cdots & u_{2n} \\
\vdots & \vdots & & \vdots \\
u_{n1} & u_{n2} & \cdots & u_{nn} \\
\hline
V_1 & V_2 & \cdots & V_n
\end{array}$

The quantities $u_{ij}$ are computed by the same method as the $u_i$ in the previous section A.2.1 [formula (25)]:

(4) $u_{ij} = C_{ji} - \sum_{s=j+1}^{n} u_{is} B_{js}$

The $V_i$ are computed in the same fashion as the $v_i$ in section A.2.1 [formula (26)]. They give a check.

Example 1. The matrix to be inverted is given in Table 6. 






TABLE 6

$\begin{bmatrix} 2 & 3 & -1 \\ 1 & -4 & 2 \\ 0 & -2 & 3 \end{bmatrix}$




Table 3 becomes: 











TABLE 7

 2.000000   3.000000  -1.000000   1.000000   0.000000   0.000000   5.000000
 1.000000  -4.000000   2.000000   0.000000   1.000000   0.000000   0.000000
 0.000000  -2.000000   3.000000   0.000000   0.000000   1.000000   2.000000


From this we derive Table 4: 


TABLE 8 

 2.000000   1.500000  -0.5000000   0.5000000   0.0000000   0.0000000   2.5000000
 1.000000  -5.500000  -0.4545454   0.0909091  -0.1818182   0.0000000   0.4545454
 0.000000  -2.000000   2.0909091   0.0869565  -0.1739130   0.4782608   1.3913042






From the table we have the solutions (Table 5): 


TABLE 9

 0.3478260   0.1304348   0.0869565
 0.3043479  -0.2608696  -0.1739130
-0.0869564   0.2173912   0.4782608
 1.5652183   1.0869563   1.3913042


This is the transpose of the inverse matrix (a transposed matrix is a 
matrix in which the rows and columns have been exchanged). The last 
row provides the check. Suppose we want to solve the system of 
equations: 

(5)
$2x_1 + 3x_2 - x_3 = 5$
$x_1 - 4x_2 + 2x_3 = -1$
$-2x_2 + 3x_3 = 5$

The solution is:

(6)
$x_1 = (0.3478260)(5) + (0.3043479)(-1) - (0.0869564)(5) = 1$
$x_2 = (0.1304348)(5) - (0.2608696)(-1) + (0.2173912)(5) = 2$
$x_3 = (0.0869565)(5) - (0.1739130)(-1) + (0.4782608)(5) = 3$

This solution is of course identical with the solution given above (section 
A.2.1, Table 6). 
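The inversion can also be sketched in Python. This is an illustration under stated assumptions rather than the tabular routine itself: the names `crout_solve` and `crout_inverse` are hypothetical, the check column is left out, and no pivoting is used. The unit matrix is placed to the right of the given matrix, as in Table 3, and the reduced array is then back-substituted.

```python
def crout_solve(a, rhs):
    """Crout-type reduction of [a | rhs] without pivoting, then back substitution."""
    n, m = len(a), len(rhs[0])
    given = [list(a[i]) + list(rhs[i]) for i in range(n)]
    aux = [[0.0] * (n + m) for _ in range(n)]
    for i in range(n):
        for r in range(i, n):          # column i, on and below the diagonal
            aux[r][i] = given[r][i] - sum(aux[r][k] * aux[k][i] for k in range(i))
        for c in range(i + 1, n + m):  # row i, right of the diagonal
            s = sum(aux[i][k] * aux[k][c] for k in range(i))
            aux[i][c] = (given[i][c] - s) / aux[i][i]
    x = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in range(m):
            s = sum(aux[i][k] * x[k][j] for k in range(i + 1, n))
            x[i][j] = aux[i][n + j] - s
    return aux, x


def crout_inverse(a):
    """Invert a by using the unit matrix as the set of right-hand columns."""
    n = len(a)
    unit = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    _, inv = crout_solve(a, unit)
    return inv


A = [[2, 3, -1], [1, -4, 2], [0, -2, 3]]
inv = crout_inverse(A)
b = [5, -1, 5]  # right-hand side of system (5)
x = [sum(inv[i][j] * b[j] for j in range(3)) for i in range(3)]
print(inv)
print(x)
```

Here `inv` is the inverse itself, so Table 9 corresponds to its transpose. Multiplying the inverse into the right-hand side of system (5) gives the solution 1, 2, 3 up to rounding.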

A.2.4 POWERS OF A MATRIX 3 

Suppose that we want to solve the system of linear homogeneous 
equations [see formula (39), section A. 1.2]: 

(1) $(A - \lambda I)x = 0$

We assume that all latent roots are real and no two roots are equal; otherwise we use the methods of Aitken.⁴ The solution can be most easily accomplished by raising the matrix $A$ to a high power. Then we have:

(2) $(A^p - \lambda^p I)x = 0$

3 R. A. Frazer, W. J. Duncan, and A. R. Collar: Elementary Matrices (Cambridge, 1946), pp. 133 ff. H. Hotelling: “Simplified computation of principal components,” Psychometrika, vol. 1 (1936), pp. 27 ff.; “Some new methods in matrix calculation,” Annals of Mathematical Statistics, vol. 14 (1943), pp. 1 ff. A. C. Aitken: “Studies in practical mathematics. II: The evaluation of the latent roots and latent vectors of a matrix,” Proceedings of the Royal Society of Edinburgh, vol. 57 (1937), pp. 269 ff. P. S. Dwyer: “A matrix presentation of least squares and correlation theory with matrix justification of improved methods of solution,” Annals of Mathematical Statistics, vol. 15 (1944), pp. 82 ff.


To solve this system, we try the trial values $x_1^{(1)} = x_2^{(1)} = \cdots = x_n^{(1)} = 1$. We form:

(3) $A^p x^{(1)} = x^{(2)\prime}$

Then we normalize the values $x_i^{(2)\prime}$ by dividing by $x_j^{(2)\prime}$, which is the largest of the $x_i^{(2)\prime}$. This gives $x_i^{(2)} = x_i^{(2)\prime} / x_j^{(2)\prime}$.

We try now:

(4) $A^p x^{(2)} = x^{(3)\prime}$

This process is repeated until the trial values $x_i^{(k-1)}$ and $x_i^{(k)}$ agree in enough decimal places. These are the solutions of the system. We have also $\lambda_1^p = x_j^{(k)\prime}$. This is the $p$th power of the largest root.

For a check we compute the product:

(5) $A x^{(k)} = x_1'$

We divide $x_1'$ by its largest element and should have $x_1 = x^{(k)}$, at least approximately. This procedure provides a check. The largest latent root is the largest element of $x_1'$. By normalizing the solutions $x_1$, we have a set of normalized solutions of equation (1) associated with the largest latent root $\lambda_1$.

Example 1. Suppose that our homogeneous system of linear equations 
is: 

(6)
$5x_1 + x_2 = \lambda x_1$
$x_1 + 3x_2 + x_3 = \lambda x_2$
$x_2 + 4x_3 = \lambda x_3$

The matrix $A$ whose latent roots we must find is:

(7) $A = \begin{array}{rrr} 5 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \\ \hline 6 & 5 & 5 \end{array}$

The last row is the check row. It is simply the sum of all the elements in each column.

4 A. C. Aitken, op. cit. G. Tintner: “The simple theory of business fluctuations: an empirical verification,” Review of Economic Statistics, vol. 26 (1944), pp. 151 ff.

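The square of the matrix is [Appendix A.1.1, formula (4)]:

(8) $A^2 = \begin{array}{rrr} 26 & 8 & 1 \\ 8 & 11 & 7 \\ 1 & 7 & 17 \\ \hline 35 & 26 & 25 \end{array}$

The square of $A^2$ is:

(9) $A^4 = \begin{array}{rrr} 741 & 303 & 99 \\ 303 & 234 & 204 \\ 99 & 204 & 339 \\ \hline 1143 & 741 & 642 \end{array}$

The square of $A^4$ is:

(10) $A^8 = \begin{array}{rrr} 650{,}691 & 315{,}621 & 168{,}732 \\ 315{,}621 & 188{,}181 & 146{,}889 \\ 168{,}732 & 146{,}889 & 166{,}338 \\ \hline 1{,}135{,}044 & 650{,}691 & 481{,}959 \end{array}$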


The last rows of these matrices serve as checks. They are the sums 
of the elements in the columns above. 
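The repeated squaring and the check rows can be reproduced with a few lines of Python. This is a sketch only; the helper names `matmul` and `column_sums` are chosen here for illustration. Besides recomputing the column sums, it uses the fact that the check row of a product equals the check row of the first factor multiplied into the second factor, which gives an independent test of each squaring.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def column_sums(a):
    """Check row: the sum of the elements in each column."""
    return [sum(col) for col in zip(*a)]


A = [[5, 1, 0], [1, 3, 1], [0, 1, 4]]
A2 = matmul(A, A)
A4 = matmul(A2, A2)
A8 = matmul(A4, A4)

# The check row of A*A must equal (check row of A) multiplied into A.
check_ok = column_sums(A2) == matmul([column_sums(A)], A)[0]

for name, m in (("A", A), ("A2", A2), ("A4", A4), ("A8", A8)):
    print(name, m, "check row:", column_sums(m))
```

Integer arithmetic is used throughout, so the entries of the eighth power are exact.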

We multiply the matrix (10) by the vector $x^{(1)} = \{1 \quad 1 \quad 1\}$. The result is:

(11) $x^{(2)\prime} = \{1{,}135{,}044 \quad 650{,}691 \quad 481{,}959\}$

We divide the result through the largest element 1,135,044 and we have:

(12) $x^{(2)} = \{1.000000 \quad 0.573274 \quad 0.424617\}$

We multiply now matrix $A^8$ [formula (10)] by $x^{(2)}$ and have:

(13) $x^{(3)\prime} = \{903{,}274.789 \quad 485{,}871.841 \quad 323{,}569.507\}$





We repeat the procedure until the results agree to three decimals: 


(14) $x^{(3)} = \{1.000000 \quad 0.537900 \quad 0.358218\}$

(15) $x^{(4)\prime} = \{880{,}906.375 \quad 469{,}461.844 \quad 307{,}328.859\}$

(16) $x^{(4)} = \{1.000000 \quad 0.532930 \quad 0.348878\}$

(17) $x^{(5)\prime} = \{877{,}761.782 \quad 467{,}154.641 \quad 305{,}045.224\}$

(18) $x^{(5)} = \{1.000000 \quad 0.532211 \quad 0.347526\}$

(19) $x^{(6)\prime} = \{877{,}306.725 \quad 466{,}820.745 \quad 304{,}714.721\}$

(20) $x^{(6)} = \{1.000000 \quad 0.532107 \quad 0.347330\}$

Since $x^{(5)}$ and $x^{(6)}$ agree in three decimals, we may take $x^{(6)}$ as an approximation to the characteristic vector. We also have $\lambda_1^8 = 877{,}307$.

To check again, we multiply $x^{(6)}$ with the original matrix $A$. We get from formula (5):

(21) $x_1' = \{5.532107 \quad 2.943651 \quad 1.921427\}$

Dividing this through the largest element, 5.532107, we have the first characteristic vector:

(22) $x_1 = \{1.000000 \quad 0.532103 \quad 0.347323\}$

This agrees almost completely with our former approximation $x^{(6)}$ [formula (20)]. We also have for the first latent root $\lambda_1 = 5.532107$. This is the largest element in vector (21).
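The iteration of this example can be followed in a short Python sketch. It is an illustration under stated assumptions: the names `matmul`, `matvec` and `power_iteration` are hypothetical, the tolerance of 0.001 stands in for "agreement to three decimals", and full floating-point precision is carried, so the last digits may differ slightly from the hand computation.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, x):
    """Product of a matrix and a vector."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def power_iteration(m, tol=1e-3, max_iter=100):
    """Multiply, normalize by the largest element, repeat until vectors agree."""
    x = [1.0] * len(m)                 # trial values 1, 1, ..., 1
    scale = None
    for _ in range(max_iter):
        y = matvec(m, x)
        scale = max(y, key=abs)        # largest element of the new vector
        x_new = [v / scale for v in y]
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    return scale, x


A = [[5, 1, 0], [1, 3, 1], [0, 1, 4]]
A2 = matmul(A, A)
A4 = matmul(A2, A2)
A8 = matmul(A4, A4)

scale, vec = power_iteration(A8)       # scale approximates the 8th power of the root
lam = max(matvec(A, vec), key=abs)     # check step with the original matrix
print(scale, vec, lam)
```

With these settings the loop stops at the sixth trial vector, as in the text, and the check step with the original matrix gives a largest root of about 5.5321.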

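We have:

(23) $\begin{bmatrix} 5 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix} \begin{bmatrix} 1.000000 \\ 0.532103 \\ 0.347323 \end{bmatrix} = 5.532107 \begin{bmatrix} 1.000000 \\ 0.532103 \\ 0.347323 \end{bmatrix}$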


The second characteristic vector is computed as follows: Let matrix $A$ be symmetrical. Let $\lambda_1$ be the largest latent root and $x_1$ the associated first characteristic vector. Then we form the sum of the squares of the elements of $x_1$:

(24) $k_1 = x_{11}^2 + x_{12}^2 + \cdots + x_{1n}^2$

From this we form a new matrix:

(25) $B = \dfrac{\lambda_1}{k_1} \begin{bmatrix} x_{11}^2 & x_{11} x_{12} & \cdots & x_{11} x_{1n} \\ x_{11} x_{12} & x_{12}^2 & \cdots & x_{12} x_{1n} \\ \vdots & \vdots & & \vdots \\ x_{11} x_{1n} & x_{12} x_{1n} & \cdots & x_{1n}^2 \end{bmatrix}$

The matrix:

(26) $D = A - B$

has the largest latent root $\lambda_2$ and the associated characteristic vector $x_2$. The third largest latent root of $A$ and the associated vector $x_3$, and higher roots and characteristic vectors, are found by analogous methods.

Example 2. We use the same matrix as in the last example:

(27) $A = \begin{bmatrix} 5 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix}$

The largest latent root was $\lambda_1 = 5.532107$. The associated first characteristic vector was (22):

(28) $x_1 = \{1.000000 \quad 0.532103 \quad 0.347323\}$

We want to compute the second largest latent root $\lambda_2$ and the associated characteristic vector. The sum of the squares of the elements of the first characteristic vector $x_1$ is (24):

(29) $k_1 = (1.000000)^2 + (0.532103)^2 + (0.347323)^2 = 1.403767$






Matrix $B$ (25) is:

(30) $B = \dfrac{5.532107}{1.403767} \begin{bmatrix} (1.000000)^2 & (1.000000)(0.532103) & (1.000000)(0.347323) \\ (1.000000)(0.532103) & (0.532103)^2 & (0.532103)(0.347323) \\ (1.000000)(0.347323) & (0.532103)(0.347323) & (0.347323)^2 \end{bmatrix} = \begin{bmatrix} 3.940901 & 2.096965 & 1.368766 \\ 2.096965 & 1.115815 & 0.728322 \\ 1.368766 & 0.728322 & 0.475403 \end{bmatrix}$

We subtract $B$ from $A$ and have matrix $D$ (26):

(31) $D = A - B = \begin{bmatrix} 1.059099 & -1.096965 & -1.368766 \\ -1.096965 & 1.884185 & 0.271678 \\ -1.368766 & 0.271678 & 3.524597 \end{bmatrix}$


Now we treat this new matrix $D$ in the same fashion as we treated matrix $A$ before. Its largest latent root is $\lambda_2 = 4.347261$, and the associated second characteristic vector is:

(32) $x_2 = \{-0.532153 \quad 0.347264 \quad 1.000000\}$

We have also with our original matrix $A$, to a high degree of accuracy:

(33) $\begin{bmatrix} 5 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix} \begin{bmatrix} -0.532153 \\ 0.347264 \\ 1.000000 \end{bmatrix} = 4.347261 \begin{bmatrix} -0.532153 \\ 0.347264 \\ 1.000000 \end{bmatrix}$

This shows that $\lambda_2$ is indeed the second largest latent root of $A$, and $x_2$ is the associated characteristic vector.
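The removal of the first root can be sketched in Python as follows. The sketch rests on assumptions of our own: the names `matvec` and `power_iteration` are hypothetical, the iteration is applied to the matrices themselves instead of to a high power, and a tight tolerance is used, so the results agree with the printed values only to about four decimals.

```python
def matvec(m, x):
    """Product of a matrix and a vector."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def power_iteration(m, tol=1e-10, max_iter=1000):
    """Largest root (in absolute value) and its vector, largest element scaled to 1."""
    x = [1.0] * len(m)
    scale = None
    for _ in range(max_iter):
        y = matvec(m, x)
        scale = max(y, key=abs)
        x_new = [v / scale for v in y]
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    return scale, x


A = [[5.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
n = len(A)

lam1, x1 = power_iteration(A)                     # first root and vector
k1 = sum(v * v for v in x1)                       # sum of squares, formula (24)
B = [[lam1 / k1 * x1[i] * x1[j] for j in range(n)] for i in range(n)]   # formula (25)
D = [[A[i][j] - B[i][j] for j in range(n)] for i in range(n)]           # formula (26)

lam2, x2 = power_iteration(D)                     # second root and vector
print(lam1, x1)
print(lam2, x2)
```

The subtraction removes the contribution of the first root from a symmetrical matrix, so the largest remaining root of the new matrix is the second root of the original one.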

If we want to find the third latent root of $A$ and the associated characteristic vector, we form the sum of squares of the elements of $x_2$, form a matrix analogous to $B$ [formula (25)], and deduct it from $D$ [formula (26)]. This matrix has the largest latent root $\lambda_3$ and the associated set of solutions $x_3$.

Sometimes we want the smallest⁵ latent root of $A$ and the corresponding solutions. Then, if the matrix is positive definite, we may proceed as follows: We form the sum of the diagonal elements of matrix $A$:

(34) $S = A_{11} + A_{22} + \cdots + A_{nn}$

Then we form a new matrix $M$ by deducting $S$ from the diagonal elements of $A$. Hence we have now $A_{ii} - S$ in the diagonal of $M$. All other elements of $M$ are the same as in $A$. Let $\lambda_s$ be the largest latent root of $M$ in absolute value. Then we have:

(35) $\lambda_n = \lambda_s + S$

$\lambda_n$ is the smallest latent root of $A$. The vector $x_n$ computed from $M$ with the root $\lambda_s$ is identical with the vector which corresponds to the smallest root $\lambda_n$ of $A$.

Example 3. We want to compute the smallest latent root of the matrix in the last example (7):

(36) $A = \begin{bmatrix} 5 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix}$

The sum of the diagonal elements is (34): $S = 5 + 3 + 4 = 12$. Deducting $S$ from the diagonal elements of $A$, we have matrix $M$:

(37) $M = \begin{bmatrix} -7 & 1 & 0 \\ 1 & -9 & 1 \\ 0 & 1 & -8 \end{bmatrix}$

The square of $M$ is:

(38) $M^2 = \begin{bmatrix} 50 & -16 & 1 \\ -16 & 83 & -17 \\ 1 & -17 & 65 \end{bmatrix}$

5 G. Tintner: “An application of the variate difference method to multiple regression,” Econometrica, vol. 12 (1944), pp. 107 ff. J. H. Smith: Statistical Deflation in the Analysis of Economic Time Series (Chicago, 1941), pp. 77 ff.





The fourth power of the matrix is:

(39) $M^4 = \begin{bmatrix} 2757 & -2145 & 387 \\ -2145 & 7434 & -2532 \\ 387 & -2532 & 4515 \end{bmatrix}$

By the iteration methods described above we compute the largest latent root and the characteristic vector. They are $\lambda_s = -9.879316$, and $x_3 = \{-0.347324 \quad 1.000000 \quad -0.532007\}$. Hence the smallest latent root of matrix $A$ is $\lambda_3 = 12 - 9.879316 = 2.120684$, and the vector belonging to this root is $x_3$.
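The computation of the smallest root can be sketched in Python along the same lines. Again this is an illustration under stated assumptions: the names `matvec` and `power_iteration` are hypothetical, and the iteration is applied to the shifted matrix itself instead of to its fourth power, with a tight tolerance, so only about four decimals should be compared with the printed values.

```python
def matvec(m, x):
    """Product of a matrix and a vector."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def power_iteration(m, tol=1e-10, max_iter=1000):
    """Largest root (in absolute value) and its vector, largest element scaled to 1."""
    x = [1.0] * len(m)
    scale = None
    for _ in range(max_iter):
        y = matvec(m, x)
        scale = max(y, key=abs)        # keeps its sign, so negative roots are handled
        x_new = [v / scale for v in y]
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    return scale, x


A = [[5.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
n = len(A)

S = sum(A[i][i] for i in range(n))     # sum of the diagonal elements
M = [[A[i][j] - (S if i == j else 0.0) for j in range(n)] for i in range(n)]

lam_s, x3 = power_iteration(M)         # root of M that is largest in absolute value
lam_min = lam_s + S                    # smallest root of A
print(lam_s, lam_min, x3)
```

Because all roots of the shifted matrix are negative for a positive definite matrix, the root of largest absolute value of the shifted matrix corresponds to the smallest root of the original matrix, here about 2.1206.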



Index of Names 



Adams, A. A., 143 

Aitken, A. C., 84, 279, 331, 351, 352 

Alexander, S. S., 73 

Allais, M., 4 

Allen, E. D., 76 

Allen, R. G. D., 3, 53, 62, 76, 331 
Alt, F., 269, 271 
Altschul, E., 73 

Anderson, O., 86, 310, 311, 319 
Anderson, R. L., 94, 192, 193, 242, 243
Anderson, T. W., 122, 127, 128, 156, 
166, 173, 243, 253 
Andrews, W. H., 57, 124, 142, 155 
Anscombe, F. J., 10 
Antonelli, E., 9 
Arley, N., 278 
Aron, R., 9 
Arrow, K. A., 22, 23 
Aujac, H., 73 

Babington-Smith, B., 102 
Bancroft, T., 21 
Barnard, G. A., 16 
Barone, E., 78 

Bartlett, M. S., 32, 93, 123, 128, 134,
205, 247, 256, 262, 279, 284, 295, 

Baumol, W. J., 8, 70, 73 
Bean, L. H., 38 
Belz, M. H., 270 
Benzel, R., 275 
Beretoni, J. N., 310 
Bergson, A., 77 
Berncrt, E. H., 107 
Bertalanffy, L. von, 208 
Beveridge, W. H., 288 
Bienstock, G., 78 
Bishop, R. L., 60 

Bohm-Bawerk, E. von, 7 
Boiteux, M., 9 
Boschan, P., 79 
Bose, R. C., 98 
Boulding, K. E., 5, 22, 46 
Bousquet, G. H., 9 

Bowley, A. L., 12, 62, 109, 152


Brady, D. S., 62 

Bronfenbrenner, M., 52, 57, 142

Brown, G. W., 96, 110 

Brownlee, O. H., 56, 76, 139 

Bruner, N., 342 

Brunt, D., 124, 217 

Burke, A., 60 

Burns, A. F., 12, 200, 216 

Carlson, S., 53 
Carnap, R., 16, 187 
Carter, C. F., 111
Chait, B., 4, 73 
Chamberlin, E., 50 
Champernowne, D. G., 253, 280 
Chapman, A. L., 146 
Clark, C., 13, 69 
Clark, J. M., 67 
Cobb, C. W., 51, 52, 142 
Cochran, W. G., 10, 90 
Cochrane, D., 15, 280, 282, 324 
Collar, A. R., 331, 332, 333, 334, 335, 
336, 337, 338, 339, 340, 351 
Commons, J. R., 13 
Cooper, G., 156 
Cournot, A., 11 
Cox, G. M., 10 

Cramer, H., 18, 19, 20, 83, 86, 87, 279
Crout, P. D., 342 

Daly, P., 52, 142 
Daniels, H. E., 212 
Dantzig, G. B., 51, 110 
Darmois, G., 18 
Daven, C., 37 
David, F. N., 18, 27, 83 
Davis, H. T., 3, 42, 95, 124, 134, 186, 
188, 191, 205, 209, 210, 213, 217, 
220, 222, 223, 224, 268, 270, 279 
284, 319 
Dean, J., 47, 48 
Demaria, G., 4 
Deming, W. E., 15, 123 
Derksen, J. B. D., 79 
Dewey, D. J., 77 




Diamanda, P. H., 295 
Divisia, F., 4, 55 
Dixon, W. J., 243 
Dobb, M. H., 78 
Domar, E. D., 13
Donner, O., 227 
Doob, L., 279 

Douglas, P. H., 51, 52, 134, 142, 152 
Duncan, D. B., 342 

Duncan, W. J., 331, 332, 333, 334, 335, 
336, 337, 338, 339, 340, 351 
Dunlop, J. T., 144 
Durand, D., 57, 98 
Durbin, J., 79, 250 

Dwyer, P. S., 110, 331, 342, 348, 351

Ellis, H. S., 3, 77, 120, 144, 270, 275 
Evans, G. C., 42, 74, 270 
Ezekiel, M., 3, 24, 34, 49, 83, 93 

Feiler, A., 78 

Feller, W., 185, 186, 240, 270, 278, 279 
Fellner, W., 6, 144 
Finetti, B. de, 16 
Fisher, I., 58, 109, 267 
Fisher, R. A., 10, 15, 86, 96, 107, 116, 
129, 192, 223, 248, 312, 314, 316,
348 

Flood, M. M., 268 
Fort, D. M., 144 
Fossati, E., 17 

Frazer, R. A., 331, 332, 333, 334, 335, 
336, 337, 338, 339, 340, 351 
Frechet, R. M., 87, 279 
French, B. L., 177 
Friedman, M., 37, 57, 59 
Frisch, R., 3, 33, 37, 58, 69, 107, 109, 
122, 124, 155, 270, 301 

Galbraith, J. K., 77 

Geary, R. C., 32, 105, 123, 127, 128 

Georgescu-Roegen, N., 64, 66 

Ghurie, G., 273 

Gilboy, E. W., 37, 67 

Gini, C., 123 

Girshick, M. A., 39, 103, 105, 107, 173 

Good, I. J., 22 

Goodwin, R. M., 69, 73 

Graaf, J. de V., 9 

Greenstein, B., 224 

Grenander, U., 279 

Grubbs, F. E., 312

Guilbaud, C. Th., 5, 79 

Gumbel, E. J., 20 

Gunn, G. T., 52, 142 


Haavelmo, T., 10, 15, 33, 39, 66, 121, 
124, 155, 156, 166, 173, 315 
Haberler, G. von, 102, 109, 120, 121, 
216 

Hagood, M. J., 107 

Hald, A., 198 

Haley, B. F., 144, 152 

Haman, A., 35 

Handsaker, M. L., 52 

Hansen, A. H., 69, 73, 77, 121 

Harman, H. H., 102 

Harris, S. E., 6, 7, 77, 143, 152 

Harrod, R. F., 73 

Hart, A. G., 13, 50 

Hart, B. I., 252 

Hayek, F. A. von, 76, 77, 78, 110, 121 
Heady, E. O., 53 
Henderson, A. M., 77, 78 
Hickman, W. B., 144 
Hicks, J. R., 7, 14, 40, 63, 73, 143, 144, 
152 

Higg, H., 37, 63 
Higgins, B. H., 77 
Hildreth, C. G., 5, 53 
Hoel, P. G., 96 
Holme, H., 270 
Holzinger, K. J., 102 
Homan, P. T., 12 

Hotelling, H., 97, 99, 102, 110, 114, 
117, 119, 121, 209, 301, 351 
Houseman, E. E., 192, 193 
Houthakker, H. S., 60 
Hsu, P. L., 107, 116, 127 
Hurwicz, L., 5, 122, 154, 156, 173, 255, 
258, 265, 275 

Illy, L., 14 
Irwin, J. O., 89 
Isaacson, S. L., 20 

James, R. W., 270 
James, S. F., 62, 247
Jeffreys, H., 16 
Jessen, R. J., 15 
Jevons, W. S., 37 
Johnson, D. G., 44 
Johnson, N. L., 315
Jones, H. E., 267 
Jordan, C., 207 

Kahn, R. F., 67 
Kaldor, N., 55 
Kalecki, M., 107, 270
Karhunen, K., 279 
Kaufmann, F., 4, 5, 7, 14 





Kaysen, C., 5 
Kempski, J. von, 9 
Kendall, D. G., 279 
Kendall, M. G., 3, 16, 18, 19, 20, 25,
83, 93, 102, 109, 114, 199, 212, 213,
218, 220, 222, 235, 260, 261, 262, 
312, 313 

Kenney, J. F., 342 

Keynes, J. M., 7, 16, 67, 70, 143, 144, 
145 

King, A. J., 15 
King, G., 37 

Klein, L. R., 4, 5, 7, 33, 55, 71, 107,
109, 123, 135, 143, 155, 173
Knight, F. H., 23, 60, 62
Kolmogoroff, A., 16
Kondratieff, N. D., 189, 216
Koopmans, T. C., 10, 12, 16, 33, 34, 50,
51, 64, 66, 110, 123, 124, 125, 129,
131, 155, 166, 173, 240, 243, 255,
268, 279, 315
Kuhn, H. W., 110
Kuznets, S., 12, 209, 227 

Lange, O., 7, 8, 14, 47, 51, 77, 78, 107, 
143 

Lawley, D. N., 102 
Leavens, D. H., 342 
Lehfeld, R. A., 37 
Leipnik, R. B., 155, 173, 243 
Leontief, W. W., 3, 5, 6, 37, 55, 63, 65,
109, 143, 144, 152 
Lerner, A. P., 77 
Lindahl, E., 275 
Lippincott, B., 77 
Little, I. D. M., 8, 9 
Livingston, S. M., 33 
Lomax, K. S., 133 
Lotka, A. J., 208 
Lundberg, E., 275 
Lundberg, O., 279
Lutfalla, G., 7, 11
Lutz, F. A., 5 
Lutz, V., 5 

Macaulay, F. W., 198 
MacDuffee, C. C., 331
Machlup, F., 67
Madow, G., 93, 243
Mahalanobis, P. C., 98
Mahr, A., 14
Malmquist, S., 40
Mann, H. B., 10, 124, 212, 265, 267
Marczewski, J., 79

Markoff, A. A., 27 



Marschak, J., 5, 10, 16, 57, 60, 78, 124, 
142, 144, 152, 155, 173 
Marshall, A., 9, 11 
Marx, K., 7 
Mason, E. S., 77 
May, K., 4, 55, 109
Meade, J. E., 7, 77, 79 
Means, G. C., 63 
Mendershausen, H., 57, 227 
Menger, K., 50, 59 
Mills, F. C., 12 
Mirkin, V. L., 63
Mises, L. von, 12, 76 
Mises, R. von, 16 
Mitchell, W. C., 12, 200, 216 
Modigliani, 7, 144 

Mood, A. M., 18, 19, 21, 24, 25, 107, 

117, 240 

Moore, G. H., 234, 235 
Moore, H. L., 37 

Moran, A. P. A., 247, 248, 250, 257 

Morgan, J. N., 60 

Morgenstern, O., 5, 60, 124 

Morse, A. P., 312 

Mosak, J. L., 33, 40, 143 

Moulton, F. R., 74 

Moyal, J. E., 279

Mudgett, B. D., 5, 33, 55 

Myint, H., 8, 59 

Myrdahl, G., 296 

Nagel, N., 16 

Nanda, D. N., 117 

Nash, J. F., 5

Nataf, A., 5, 55, 109 

Neumann, J. von, 5, 60, 110, 252

Neyman, J., 16, 18, 19, 20, 27, 83, 279 

Nichols, W. H., 53 

Nordin, i. A., 49, 60 

Ogawara, M., 243 
Olson, E., 52, 142 

Orcutt, G. H., 247, 280, 282, 294, 324 

Papandreou, A. G., 12 
Pareto, V., 9
Parsons, T., 6, 8
Patinkin, D., 143
Pearl, R., 208
Pearson, E. S., 20
Pearson, K., 123
Perlman, S., 13
Perroux, F., 78, 79
Pesmazoglu, J. S., 69
Phipps, C. G., 144 





Pietra, G., 123 

Pigou, A. C., 8, 37, 73, 144 
Poincare, H., 8 
Pou, S. S., 4, 55, 109 
Prest, A. R., 324 

Quenouille, M. H., 247, 273, 295 
Quesnay, P., 63 

Radice, E. A., 69 
Rao, C. R., 96 
Reddaway, W. B., 111
Reder, M., 8, 57
Reichenbach, H., 13, 16
Reiersol, O., 69, 122, 123, 155
Rhodes, E. C., 111, 123, 198, 209
Ricci, U., 152 
Robbins, L. C., 12, 13, 51 
Robinson, G., 209, 217 
Robinson, J., 48, 143 
Rockefeller, D., 77 

Roos, C. F., 42, 43, 123, 268, 269, 270, 
272 

Rosenstein-Rodan, P. N., 14 

Rothschild, K. W., 4 

Rouquet La Garrigue, V., 37 

Roy, A., 55 

Roy, P. N., 98

Roy, R., 5, 37, 55, 109 

Roy, S. N., 98 

Rubin, H., 155, 156, 166, 173, 240, 243 
Rueff, J., 4
Russel, B., 8 

Samuelson, P. A., 8, 14, 50, 58, 59, 60, 
70, 143, 152, 270, 275 
Sasuly, M., 191 
Savage, L. J., 59 
Scheffe, H., 188, 212, 234 
Schelting, A. von, 6 
Schneider, E., 53, 54 
Schoenberg, E. H., 152 
Schuetz, A., 6 

Schultz, H., 34, 37, 38, 39, 44 
Schultz, T. W., 77 
Schumacher, H., 13 
Schumpeter, J. A., 3, 190, 200, 216 
Schuster, A., 223 
Schwartz, S. M., 78 
Scitovsky, T. de, 8 
Shepherd, G. S., 37, 44 
Simon, H. A., 65 
Simons, H., 76 
Slutzky, E., 205 
Smith, H. M., 64 


Smith, J. H., 186, 356
Smith, V. E., 57 
Smithies, A., 33, 79, 275 
Smithies, N., 143 
Smythe, L. T., 60 

Snedecor, G. W., 21, 88, 248, 315 

Snyder, C., 138 

Sombart, W., 13 

Stackelberg, H. von, 54 

Staehle, H., 51 

Stamp, J., 12 

Stigler, G. J., 9, 43, 50, 54, 64, 73, 141 
Stone, R., 4, 5, 33, 40, 41, 79, 107, 111, 
124, 143, 146, 265, 326 
Strecker, H., 310
Stuvel, G., 62 
Suranyi-Unger, T., 77 
Sweezy, P. M., 7, 13, 78 
Szeliski, V. von, 43, 272

Tarshis, L., 144 
Thomson, G. H., 102 
Thorne, G. B., 38 
Thorp, W. L., 200 
Thurston, L. L., 57 
Tick, L. V., 22 

Tinbergen, J., 3, 69, 70, 79, 102, 120, 
143, 216, 266
Tobin, J., 40 
Tschuprow, A. A., 83 
Tucker, A. W., 110 

Ullmer, M. J., 5 
Ullmo, J., 67 
Uven, M. J. van, 123 

Verhulst, M. J. J., 53 
Viner, J., 48 
Vining, R., 12 
Volterra, V., 208 

Wagemann, E., 98 

Wald, A., 22, 57, 60, 123, 124, 156, 
186, 205, 227, 229, 240, 265, 267, 
309 

Walker, A. M., 295 

Walker, G., 223, 261 

Wallis, W. A., 57, 234, 235 

Walras, L., 9 

Watson, G. S., 250 

Waugh, F. V., 58, 66, 117, 120, 301

Weatherburn, C. E., 15, 258 

Weber, M., 6, 8 

Weinberger, O., 14 

Weyl, H., 8 





Whitmann, R. H., 41 
Whittaker, E. T., 209, 217 
Whittle, P., 279 
Wicksell, K., 8 
Wicksteed, P. H., 53 
Wiener, N., 279 

Wilks, S. S., 83, 86, 87, 89, 93, 102, 
107, 114, 117, 131, 168 
Wilson, E. B., 224 
Winkler, W., 3, 4, 38, 58, 109 
Wishart, J., 89 
Wisniewski, J., 229 

Wold, H., 186, 224, 255, 261, 275, 284,
285, 286, 288, 294, 296 


Wolfowitz, J., 240 
Wood, M. K., 51 
Working, E. J., 16, 155 
Working, H., 301 
Wyckoff, V., 77 
Wylie, K. H., 49 

Yates, F., 192, 248 
Yntema, T. O., 49 
Yugow, A., 78 

Yule, G. U., 3, 109, 239, 260 

Zaycoff, R., 310, 311 
Zrzavy, F. I., 229 



Subject Index 



Accounting, 79 
Addition of matrices, 333 
Aggregation, 4, 109 
Agricultural products, 132 
demand, 11, 45, 132
price fixing, 11
subsidy, 46 
supply, 45, 132 
tax, 46 

Agriculture, production function, 303 
American economy, production function, 134

Amplitude, 218, 222 

and moving average, 207 
Autocorrelated series, correlation coefficient, 247
covariance, 248 
relations, 247 
Autocorrelation, 187, 240 
circular test, 242 
non-parametric test, 240 
residuals, 250 
von Neumann ratio, 253 
Autocovariance, 187 
Automobiles, demand, 43 
Autoregression, linear, 255, 284, 294 
Autoregressive transformation, 323 
Averages, moving, 198, 285, 319 
and amplitude, 207 
goodness of fit, 286 
and random element, 203 
successive, 202 

Beef, demand, 39 
Beer, demand, 41 
price fixing, 41
taxation, 41

Blocks of transactions, 107 
Building, residential, 270 
Business annals, 200 


Business cycles, 70, 216 
simple theory, 72 

Canonical correlation, 114 
Central planning, 78 
Characteristic vector, 341 
computation, 352 
Choice, between models, 17, 186 
multiple, 22, 128, 191, 312 
Circular test for autocorrelation, 242
Clerks, Engel curves, 62 
income elasticity, 62 
Clothing, demand, 62 
Coefficients, structural, 10, 121, 156 
Components, principal, 102 
Computation, 342 

characteristic vector, 352 
determinant, 348 
inverse matrix, 348 
latent roots, 352 
linear equations, 342 
powers of a matrix, 351 
systems of equations, 342 
transformation of variables, 346 
Confidence limits, 19, 30 
Consistent estimate, 86
Constant marginal cost, 49
Constant returns to scale, 141
Constants, structural, 10, 121, 156
Consumption, meat, 195, 220, 225, 242,
245, 262, 267, 290, 298, 320
and meat prices, 117 
pig iron, 69 

Convergence in probability, 86 
Copper deliveries, 271 
Corn, 249 

prices, 249, 258 
production, 268, 276 
stocks, 249, 259
Correlation, canonical, 114

coefficients, autocorrelated series, 247 




Correlation, lag, 187, 269 
multiple, 87 
partial, 91 
rank, 212 
serial, 187, 269
Correlogram, 187, 284 
hidden periodicities, 285 
linear autoregression, 294 
moving averages, 285 
Cost, hosiery, 48 
of living, 296 
marginal, 49 
steel, 49 

Cotton yarn, demand, 133, 134 
supply, 133, 134 

Covariance, autocorrelated series, 248 
serial, 187 

of series with first-order stochastic 
difference equations, 258 
Crout method, 342 
Cycle, 73, 216 

non-parametric test, 234 

Data, non-experimental, 15 
Decreasing marginal returns, 54 
Definitional relations, 9 
Deliveries, copper, 271 
Demand, agricultural products, 11, 45, 
132 

automobiles, 43 

beef, 39 

beer, 41 

clothing, 62 

cotton yarn, 133, 134 

food, 61, 62 

fuel, 62 

homogeneity, 143 
labor, 143 
light, 62 

meat, 169, 176, 282, 306, 327 

non food, 61 

pork, 39 

potatoes, 326 

rayon, 134 

steel, 42 

wheat, 38 

Demand functions, 36 
Determinant, 336 

computation, 348 


Determinantal equation, 339 
Diagonal matrix, 332 
Difference, 309 

Difference equation, stochastic, 265, 
284, 294 

arbitrary order, 265 
errors in the variables, 272 
first-order, 255 
goodness of fit, 295 
and process analysis, 275 
second-order, 260 
systems, 267 

Difference transformation, 325 
Differences, selection, 316 
variance, 311
efficiency, 312
exact test, 314
large sample test, 310 
Differential equation, logistic, 209 
stochastic, 277 
Discriminant analysis, 96 
test, 97 

Dynamic model, 69 
United States, 71 

Econometrics, 3 
and economics, 3 
journals, 3 
and policy, 8, 76 
and statistics, 14, 15 
Economic policy, 8, 76 
Economics, and econometrics, 3 
and history, 13 
institutional, 13 
laws, 8, 13
mathematical, 4, 7 
pure, 12 

and social science, 14 
statistical, 12 
verification, 13 
Economy, laissez faire, 76 
mixed, 76, 77 
planned, 78 
random, 185 

Efficiency, variance of differences, 312
Efficient estimates, 86 
Elasticity, of demand, agricultural products, 11, 18, 133
automobiles, 43 
beef, 40 




Elasticity, of demand, beer, 41 
corn, 276 

cotton yarn, 133, 134 
food, 61, 62 
labor, 150 
meat, 170, 283, 327 
non-food, 62 
potatoes, 326 
rayon, 134 
wheat, 38 

of production, agriculture, 132, 304 
American economy, 138 
manufacturing, 53 

of supply, corn, 276 
cotton yarn, 134 
labor, 151 
meat, 184 
sugar, 44 

Element, matrix, 331 

Endogenous variables, 155 

Engel curve, 60, 61 
clerks, 62 

Entrepreneur, 55 


Equations, determinantal, 339 
errors in, 124, 156 
identification, 155 
just identified, 166 
linear, 335 

computation, 342 
over-identified, 172 
structural, 156 
Errors, correlated, 279 

in equation, 28, 85, 154, 156 
in variables, 28, 85, 122 

stochastic difference equations, 272 
Estimate, consistent, 86 
efficient, 86 
point, 18 
sufficient, 87 

Exogenous variables, 156 
Experiment, 10 

Factor, 102 
Farm price index, 230 
Farms, Iowa, production function, 

(1939), 56 
(1942), 53 

Fiducial limits, 19, 30, 88 
Finance, functional, 77 


Finite difference, 309 
transformation, 325 
variance, 311 
efficiency, 312 
exact test, 314 
large sample test, 310 
and von Neumann ratio, 253 
First-order stochastic difference equation, 255

independence of series, 257 
Fisher test, 223 

Flour and wheat characteristics, 117 
Food, demand, 61, 62 
Form, reduced, 167 
Fourier analysis, 217 
Freight car loadings, 220, 224 

Frequency definition of probability, 17 
Fuel, demand, 62 

Functional finance, 77 
Games, 5 

Goodness of fit, moving average, 286 
stochastic difference equation, 295 

Hidden periodicities, 222 
correlogram, 284 
History and economics, 13 
Homogeneity, demand labor, 143 
production function, 52, 53, 91, 141 
supply labor, 6, 143 
Hosiery, cost, 48 
Hypothesis, test, 20 

Identification, 33, 123, 155, 156 
rule, 157 
Income elasticity, clerks, 62 

of demand, agricultural products, 133 
automobiles, 43 
beef, 40 
beer, 41 

meat, 170, 184, 307, 327, 328, 329 
potatoes, 326
Income, national, 12 
Independence, series with first-order 
difference equations, 257 
Index, farm prices, 230 
wholesale prices, 200, 237 
Index number, 4, 109 
Indifference system, 60 
Industrial production, 210 





Industry, control, 76 
Inference, non-parametric, 188 
statistical, 18 

Institutional economics, 13 
Introspection, 14 
Inverse matrix, 337 
computation, 348 

Jointly dependent variables, 155 
Just identified equation, 166 

Keynesian system, 6, 143 

Labor, demand, 143 
supply, 6, 143 
Lag correlation, 187, 269 
Laissez faire, 76 
Latent root, matrix, 340 
computation, 352 
Laws, economics, 13 
Least squares, 18, 27, 83, 125, 167, 191, 
200, 209, 218, 250, 265, 268, 
276, 302, 305, 319, 324 
correlated errors, 279 
Light, demand, 62 

Likelihood, maximum, 18, 26, 86, 106,
125, 173, 265

Limits, confidence, 19, 30, 88 
fiducial, 19, 30, 88 
Linear autoregression, 255, 294 
goodness of fit, 295 
Linear equations, 335 
computation, 342 
homogeneous, 338 
Linear regression, multiple, 83 
simple, 24 

Linear relations, with multiple regression, 89

with weighted regression, 130 
Linear trend in multiple regression, 301 
Loadings, freight car, 220, 224 
Loans, 98 

Logic and mathematics, 8 
Logistic, 208 
differential equation, 209 
Logistics, 8 

Long waves, 189, 216 

Manufacturing, production function, 52 
Marginal cost, constant, 49 


Marginal productivity, 56, 138 
Marginal returns, decreasing, 54 
Marginal utility of money, 58 
Markoff theorem, 83 

Mathematical economics, 4, 7
Mathematics and logic, 8 
Matrix, 331 
addition, 333 
characteristic vector, 341 
computation, 352 
diagonal, 332 
element, 331 
inverse, 337 
computation, 348
latent root, 340 
computation, 352 
multiplication, 334 
scalar, 333 
order, 331
power, 334, 351 
subtraction, 333 
symmetrical, 332 
transposed, 332 
unit, 332 

Maximum likelihood, 18, 26, 86, 106,
125, 173, 265

Mean squared amplitude, 222, 284 
Mean square successive difference, 252 
Meat, consumption, 195, 220, 225, 242,
245, 251, 262, 267, 274, 290,
298, 320

and prices, 117 

demand, 169, 177, 282, 306, 327
prices, 254 

and consumption, 117
supply, 171, 306 
Minimum wage, 40 
Minor, 336 
principal, 337 
Mixed economy, 77 
Model, 6 

Models, choice between, 186 
Monetary policy, 76 

Money, marginal utility, 58 
wages, 6, 143 

Movement, oscillatory, 187, 216 
periodic, 187, 216 
Moving average, 198, 285, 319
and amplitude, 207 
goodness of fit, 286 




Moving average, and random element, 203

successive, 202 
weight, 199, 319 
Multicollinearity, 33 
test, 127 

Multiple choice, 22, 128, 191, 312 
Multiple correlation coefficient, 87 
test, 88 

Multiple regression, 32, 83 
and linear trend, 301 
and polynomial trend, 304 
test for linear relations, 89 
trends, 301 

Multiplication, matrix, 334 
Multiplier, 67 
Multivariate analysis, 93 

National income, 12 

blocks of transactions, 107 
Non-experimental data, 15 
Non-food, demand, 61 
Non-linear regression, 32, 190 
Non-parametric inference, 188 
Non-parametric test, autocorrelation, 240

cycle, 234 
trend, 211
Normalization, 339 
Numerical computation, 342 
characteristic vector, 352 
determinant, 348 
inverse matrix, 348 
latent root, 352 
linear equations, 342 
power of a matrix, 351 
systems of equations, 342 
transformation of variables, 346 

Order, matrix, 331 
Orthogonal polynomials, 190 
in multiple regression, 304 
Oscillatory movement, 187, 216 
Over-identified equation, 172 

Parameter, structural, 10, 121, 156 
Partial correlation, 91 
Passengers, streetcar, 229 
Peak, 234 

Periodic movement, 187, 216 


Periodogram, 222 

and autocorrelation, 284 
Phase, 234 

Pig iron, consumption, 69 
Planning, central, 78 
Point estimation, 18 

Policy, and econometrics, 8, 76 
monetary, 76 
social, 8 

Polynomial, orthogonal, 190 
Polynomial trends, 190 
in multiple regression, 304 
Pork, demand, 39 
price fixing, 40 
Positive score, 212 
Potatoes, demand, 326 
yield, 235 

Power of a matrix, 334, 351 
Predetermined variables, 156 
Prediction, 24 

Prices, in business cycle, 99 
and consumption, 117 
corn, 249, 258 
and stocks, 260 
consumption goods, 99 
production goods, 99 
stock, 213 
sugar, 193 

wheat flour, 241, 244, 313, 316 
wholesale, 200, 211, 214, 221, 246, 
263, 293, 299 

Price fixing, agricultural products, 11 
beer, 41 

pork, 40 
sugar, 44 
wheat, 39 

Price index, wholesale, 200, 214, 227 
Price indices, 113 

and production indices, 118 
Principal component, 102 
Principal minor, 337 
Probability, 16 

frequency definition, 17 
Process, stochastic, 277 
Process analysis and stochastic difference equations, 275
Production, corn, 276 
industrial, 210 

Production function, agriculture, 303 
American economy, 134 





Production function, constant returns 
to scale, 54, 91, 141 
homogeneity, 54, 91, 141 
Iowa farms (1939), 56 
(1942), 53 
manufacturing, 52 
Production indices, 111 
and price indices, 118 
Productivity, marginal, 56, 138 
Propensity to consume, 67 
Pure economics, 12 

Random economy, 185 

Random element and moving averages, 
203 

Random sample, 185 
Random variability, reduction, 319 
Random variable, 18 
Rank, 212 

Rank correlation, 212 
Ratio of mean square successive difference to variance, 252
Rayon, demand, 134 

Real wages, 6, 143 
Reduced form, 167 

Reduction of random variability, 319 
Regression, linear, 24 
multiple, 32, 83 
non-linear, 32 
simple, 24 
weighted, 121 

Regression coefficient, 27, 84 
Regression coefficients, test, 88 
Regression constant, 27, 88 
Regression equation, 31, 88 
Relations, between autocorrelated 
series, 247 
definitional, 9 
structural, 9, 121, 156
Residential building, 270 
Residuals, autocorrelation, 250 
Returns, marginal, 54 
to scale, 55, 91, 142 
Root, latent, 340 
computation, 352 
Rules for identification, 157 
Run, 234 

Sample, random, 185
Sampling, 15 


Scalar, 333 

Scalar multiplication, 333 
Schuster test, 223 
Score, positive, 212 
total, 212 
Seasonal, 227 

Second-order stochastic difference equation, 260

Selection of differences, 315 

Serial correlation, 187, 269 

Serial covariance, 187 

Series, autocorrelated, relations, 247 

Ship freight, 131 

Significance test, 21 

Simple regression, 24 

Single stochastic difference equation, 265

Social policy, 8 

Social science and econometrics, 13 
Stationary, 165, 186, 257, 268, 284, 
285, 295 

Statistical economics, 12 
Statistical inference, 18 

Statistical methods in econometrics, 15 
Steel, cost, 49 
demand, 42 

Stochastic difference equations, 255, 284

errors in the variables, 272 
first-order, 255 
goodness of fit, 295 
and process analysis, 275 
second-order, 260 
single, 265 
systems, 267 

Stochastic differential equations, 277 
Stochastic processes, 277 
Stock prices, 213 
Stocks, corn, 249, 259 
and prices, 260 
Streetcar passengers, 229 
Structural coefficients, 10, 122, 156 
Structural constants, 10, 122, 156 
Structural equations, 10, 122, 156 
Structural parameters, 10, 121, 156 
Structural relations, 9, 121, 156 
Structure, 10 

Subsidy, agricultural products, 46 
Subtraction, matrix, 333 
Sufficient estimate, 87 




Sugar, price, 193 
price fixing, 44 
supply, 44 

Supply, agricultural products, 45, 132 
cotton yarn, 133, 134 
homogeneity, 6, 143 
labor, 6, 143 
meat, 171, 306 
sugar, 44 

Symmetrical matrix, 332 
Systematic part, 124, 309 

Systems of stochastic difference equations, 267

t-test, 21, 29, 88, 129
Tableau economique, 63 
Tariff, 8 

Tax, agricultural products, 46 
beer, 41 

Test, autocorrelation, 242 
Fisher, 223 
hypothesis, 20, 30 

linear relations in multiple regression, 89

multicollinearity, 127 
non-parametric, autocorrelation, 240
cycle, 234 
trend, 211
Schuster, 223
significance, 21, 29
variances of differences, exact, 314 
large sample, 310 
Walker, 223 
Theoretical model, 6 
Time series, 185 
Total score, 212 
Transactions, blocks, 107 
Transformation, autoregressive, 323 
difference, 325 
observations, 301 
variables, 346 
Transposed matrix, 332 
Trend, 189 

in multiple regression, 301 
linear, in multiple regression, 301 
non-parametric test, 211 
logistic, 208 
polynomial, 190


Trend, polynomial, in multiple regression, 304
Trough, 234 

Unbiased, 84 
Unit matrix, 332 

Variables, endogenous, 155 

exogenous, 156

jointly dependent, 155 
predetermined, 156 
transformation, 346 
Variance, differences, 311 
efficiency, 312 
exact test, 314 
large sample test, 310 
Variate difference method, 125, 135, 308

Vector, 335 
characteristic, 341 
computation, 352 
Verification, 13 

von Neumann ratio, 252, 326 
and autocorrelation, 253 
and finite differences, 253 

Wages, minimum, 40 
money, 6, 143 
real, 6, 143 
Walker test, 223 
Waves, long, 189, 216 
Weight, moving average, 199, 319 
Weighted regression, 121 
linear relations, 130 

maximum likelihood estimates, 125 
multicollinearity, 127 
test, coefficients, 129 
Welfare economics, 8 
Wheat, demand, 38 

and flour characteristics, 117 
price, 288 
price fixing, 39 


Wheat-flour price, 241, 244, 254, 313, 316



Wholesale prices, 200, 211, 214, 221, 222, 237, 246, 263, 277, 293, 299




























